Abstract


Persuasion, defined as the act of exploiting an informational advantage in order to influence the decisions 
of others, is ubiquitous. Indeed, persuasive communication has been estimated to account for almost a 
third of all economic activity in the US. This paper examines persuasion through a computational lens, 
focusing on what is perhaps the most basic and fundamental model in this space: the celebrated Bayesian 
persuasion model of Kamenica and Gentzkow [34]. Here there are two players, a sender and a receiver. 
The receiver must take one of a number of actions with a-priori unknown payoff, and the sender has 
access to additional information regarding the payoffs of the various actions for both players. The sender 
can commit to revealing a noisy signal regarding the realization of the payoffs of various actions, and 
would like to do so as to maximize her own payoff in expectation assuming that the receiver rationally 
acts to maximize his own payoff. When the payoffs of various actions follow a joint distribution (the 
common prior), the sender’s problem is nontrivial, and its computational complexity depends on the 
representation of this prior. 

We examine the sender’s optimization task in three of the most natural input models for this problem, 
and essentially pin down its computational complexity in each. When the payoff distributions of the 
different actions are i.i.d. and given explicitly, we exhibit a polynomial-time (exact) algorithmic solution, 
and a “simple” (1 - 1/e)-approximation algorithm. Our optimal scheme for the i.i.d. setting involves an 
analogy to auction theory, and makes use of Border’s characterization of the space of reduced-forms for 
single-item auctions. When action payoffs are independent but non-identical with marginal distributions 
given explicitly, we show that it is #P-hard to compute the optimal expected sender utility. In doing so, 
we rule out a generalized Border’s theorem, as defined by Gopalan et al. [30], for this setting. Finally, 
we consider a general (possibly correlated) joint distribution of action payoffs presented by a black box 
sampling oracle, and exhibit a fully polynomial-time approximation scheme (FPTAS) with a bi-criteria 
guarantee. Our FPTAS is based on Monte-Carlo sampling, and its analysis relies on the principle of 
deferred decisions. Moreover, we show that this result is the best possible in the black-box model for 
information-theoretic reasons. 


*Supported in part by NSF CAREER Award CCF-1350900. 
1 Introduction 

“One quarter of the GDP is persuasion.” 


This is both the title, and the thesis, of a 1995 paper by McCloskey and Klamer [39]. Since then, 
persuasion as a share of economic activity appears to be growing; a more recent estimate places the figure 
at 30% [?]. As both papers make clear, persuasion is intrinsic in most human endeavors. When the tools of 
“persuasion” are tangible (say goods, services, or money) this is the domain of traditional mechanism 
design, which steers the actions of one or many self-interested agents towards a designer’s objective. What 
[?] and much of the relevant literature refer to as persuasion, however, are scenarios in which the power 
to persuade derives from an informational advantage of some party over others. This is also the sense 
in which we use the term. Such scenarios are increasingly common in the information economy, and it is 
therefore unsurprising that persuasion has been the subject of a large body of work in recent years, motivated 
by domains as varied as auctions [?, 25, 24, 10], advertising [?, 33, 17], voting [3], security [46, 42], 
multi-armed bandits [37, 38], medical research [35], and financial regulation [28, 29]. (For an empirical survey 
of persuasion, we refer the reader to [21].) What is surprising, however, is the lack of systematic study of 
persuasion through a computational lens; this is what we embark on in this paper. 

In the large body of literature devoted to persuasion, perhaps no model is more basic and fundamental 
than the Bayesian Persuasion model of Kamenica and Gentzkow [34], generalizing an earlier model by 
Brocas and Carrillo [14]. Here there are two players, whom we call the sender and the receiver. The receiver 
is faced with selecting one of a number of actions, each of which is associated with an a-priori unknown 
payoff to both players. The state of nature, describing the payoff to the sender and receiver from each action, 
is drawn from a prior distribution known to both players. However, the sender possesses an informational 
advantage, namely access to the realized state of nature prior to the receiver choosing his action. In order to 
persuade the receiver to take a more favorable action for her, the sender can commit to a policy, often known 
as an information structure or signaling scheme, of releasing information about the realized state of nature to 
the receiver before the receiver makes his choice. This policy may be simple, say by always announcing the 
payoffs of the various actions or always saying nothing, or it may be intricate, involving partial information 
and added noise. Crucially, the receiver is aware of the sender’s committed policy, and moreover is rational 
and Bayesian. We examine the sender’s algorithmic problem of implementing the optimal signaling scheme 
in this paper. A solution to this problem, i.e., a signaling scheme, is an algorithm which takes as input the 
description of a state of nature and outputs a signal, potentially utilizing some internal randomness. 

1.1 Two Examples 


To illustrate the intricacy of Bayesian Persuasion, Kamenica and Gentzkow [34] use a simple example in 
which the sender is a prosecutor, the receiver is a judge, and the state of nature is the guilt or innocence 
of a defendant. The receiver (judge) has two actions, conviction and acquittal, and wishes to maximize 
the probability of rendering the correct verdict. On the other hand, the sender (prosecutor) is interested 
in maximizing the probability of conviction. As they show, it is easy to construct examples in which the 
optimal signaling scheme for the sender releases noisy partial information regarding the guilt or innocence 
of the defendant. For example, if the defendant is guilty with probability $1/3$, the prosecutor’s best strategy 
is to claim “guilt” whenever the defendant is guilty, and also claim “guilt” just under half the time when 
the defendant is innocent. As a result, the defendant will be convicted whenever the prosecutor claims 
“guilt” (happening with probability just under $2/3$), assuming that the judge is fully aware of the prosecutor’s 
signaling scheme. We note that it is not in the prosecutor’s interest to always claim “guilt”, since a rational 
judge aware of such a policy would ascribe no meaning to such a signal, and render his verdict based solely 
on his prior belief; in this case, this would always lead to acquittal.* 


*In other words, a signal is an abstract object with no intrinsic meaning, and is only imbued with meaning by virtue of how it is 
used. In particular, a signal has no meaning beyond the posterior distribution on states of nature it induces. 
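
The arithmetic behind the prosecutor example can be checked with a short sketch. The function name `prosecutor_scheme` and its parameters are illustrative and not part of the original model description; the only inputs are the prior probability of guilt and the probability of claiming “guilt” when the defendant is innocent. At exactly one half the judge is indifferent, which is why the text speaks of “just under half”.

```python
from fractions import Fraction

def prosecutor_scheme(prior_guilty, claim_if_innocent):
    """Return (P[signal "guilt"], P[guilty | signal "guilt"]).

    The prosecutor always claims "guilt" when the defendant is guilty and
    claims "guilt" with probability claim_if_innocent otherwise.
    """
    p_signal = prior_guilty + (1 - prior_guilty) * claim_if_innocent
    posterior = prior_guilty / p_signal
    return p_signal, posterior

prior = Fraction(1, 3)

# Claiming "guilt" for half of the innocent defendants makes the judge indifferent.
p_signal, posterior = prosecutor_scheme(prior, Fraction(1, 2))
print(p_signal, posterior)      # 2/3 1/2

# Always claiming "guilt" is uninformative: the posterior equals the prior.
p_all, posterior_all = prosecutor_scheme(prior, Fraction(1))
print(p_all, posterior_all)     # 1 1/3
```

With the posterior at the prior of 1/3, a judge who maximizes the probability of a correct verdict acquits, matching the remark above about the uninformative policy.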
A somewhat less artificial example of persuasion is in the context of providing financial advice. Here, 
the receiver is an investor, actions correspond to stocks, and the sender is a stockbroker or financial adviser 
with access to stock return projections which are a-priori unknown to the investor. When the adviser’s 
commission or return is not aligned with the investor’s returns, this is a nontrivial Bayesian persuasion 
problem. In fact, interesting examples exist when stock returns are independent from each other, or even 
i.i.d. Consider the following simple example which fits into the i.i.d. model considered in Section 3: there 
are two stocks, each of which is a-priori equally likely to generate low (L), moderate (M), or high (H) 
short-term returns to the investor (independently). We refer to L/M/H as the types of a stock, and associate 
them with short-term returns of $0$, $1 + \epsilon$, and $2$ respectively. Suppose, also, that stocks of type L or H are 
associated with poor long-term returns of $0$; in the case of H, high short-term returns might be an indication 
of volatility or overvaluation, and hence poor long-term performance. This leaves stocks of type M as the 
only solid performers with long-term returns of $1$. Now suppose that the investor is myopically interested in 
maximizing short-term returns, whereas the forward-looking financial adviser is concerned with maximizing 
long-term returns, perhaps due to reputational considerations. Simple calculation shows that providing full 
information to the myopic investor results in an expected long-term reward of $1/3$, as does providing no 
information. An optimal signaling scheme, which guarantees that the investor chooses a stock with type 
M whenever such a stock exists, is the following: when exactly one of the stocks has type M recommend 
that stock, and otherwise recommend a stock uniformly at random. A simple calculation using Bayes’ rule 
shows that the investor prefers to follow the recommendations of this partially-informative scheme, and it 
follows that the expected long-term return is $5/9$. 
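
The three quantities in the two-stock example can be verified by enumerating the nine equally likely states. The sketch below is illustrative only: the names `full_information`, `recommend` and `scheme_value_and_ic` are hypothetical, and the concrete value chosen for epsilon is an assumption (any small positive value gives the same conclusions).

```python
from fractions import Fraction
from itertools import product

TYPES = ("L", "M", "H")
EPS = Fraction(1, 100)                                   # assumed small epsilon
SHORT = {"L": Fraction(0), "M": 1 + EPS, "H": Fraction(2)}      # investor's payoff
LONG = {"L": Fraction(0), "M": Fraction(1), "H": Fraction(0)}   # adviser's payoff
STATES = list(product(TYPES, repeat=2))                  # 9 equally likely states
PRIOR = Fraction(1, 9)

def full_information():
    """Investor sees both types and picks the higher short-term return."""
    total = Fraction(0)
    for state in STATES:
        best = max(SHORT[t] for t in state)
        # Ties occur only between equal types, so any maximiser has the same long-term value.
        chosen = next(t for t in state if SHORT[t] == best)
        total += PRIOR * LONG[chosen]
    return total

def recommend(state):
    """Recommendation distribution over stock indices under the partially informative scheme."""
    is_m = [t == "M" for t in state]
    if sum(is_m) == 1:
        return {is_m.index(True): Fraction(1)}
    return {0: Fraction(1, 2), 1: Fraction(1, 2)}

def scheme_value_and_ic():
    value = Fraction(0)
    # joint[i][j]: unnormalised expected short-term return of stock j when stock i is recommended
    joint = [[Fraction(0)] * 2 for _ in range(2)]
    for state in STATES:
        for i, prob in recommend(state).items():
            value += PRIOR * prob * LONG[state[i]]
            for j in range(2):
                joint[i][j] += PRIOR * prob * SHORT[state[j]]
    ic = all(joint[i][i] >= joint[i][j] for i in range(2) for j in range(2))
    return value, ic

# With no information the stocks look identical, so the chosen stock is of type M with probability 1/3.
no_information = sum(PRIOR * LONG[state[0]] for state in STATES)

print(full_information())       # 1/3
print(no_information)           # 1/3
print(scheme_value_and_ic())    # (Fraction(5, 9), True)
```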

1.2 Results and Techniques 

Motivated by these intricacies, we study the computational complexity of optimal and near-optimal persuasion 
in the presence of multiple actions. We first observe that a linear program with a variable for each 
(state-of-nature, action) pair computes a description of the optimal signaling scheme. However, when action 
payoffs are distributed according to a joint distribution (say, one exhibiting some degree of independence 
across different actions), the number of states of nature may be exponential in the number of actions; in 
such settings, both the number of variables and constraints of this linear program are exponential in the 
number of actions. It is therefore unsurprising that the computational complexity of persuasion depends 
on how the prior distribution on states of nature is presented as input. We therefore consider three natural 
input models in increasing order of generality, and mostly pin down the complexity of optimal and near-optimal 
persuasion in each. Our first model assumes that action payoffs are drawn i.i.d. from an explicitly 
described marginal distribution. Our second model considers independent yet non-identical actions, again 
with explicitly-described marginals. Our third and most general model considers an arbitrary joint distribution 
of action payoffs presented by a black-box sampling oracle. In proving our results, we draw connections 
to techniques and concepts developed in the context of Bayesian mechanism design (BMD), exercising and 
generalizing them along the way as needed. We mention some of these connections 
briefly here, and elaborate on the similarities and differences from the BMD literature in Appendix A. 

We start with the i.i.d. model, and show two results: a “simple” and polynomial-time $(1 - 1/e)$-approximate 
signaling scheme, and a polynomial-time implementation of the optimal scheme. Both results hinge on a 
“symmetry characterization” of the optimal scheme in the i.i.d. setting, closely related to the symmetrization 
result from BMD by [20] but with an important difference which we discuss in Appendix A. Our “simple” 
scheme decouples the signaling problem for the different actions and signals independently for each. This 
result implies that signaling in this setting can be “distributed” among multiple non-coordinating persuaders 
without much loss. Our optimal scheme involves a connection to Border’s characterization of the space 
of feasible reduced-form auctions [?], as well as its algorithmic properties [?]. This connection 
involves proving a correspondence between “symmetric” signaling schemes and a subset of “symmetric” 
single-item auctions; one in which actions in persuasion correspond to bidders in an auction. 




Next, we consider Bayesian persuasion with independent non-identical actions. One might expect that 
the partial correspondence between signaling schemes and single-item auctions in the i.i.d. model generalizes 
here, in which case Border’s theorem (which extends to single-item auctions with independent 
non-identical bidders) would analogously lead to a polynomial-time algorithm for persuasion in this setting. 
However, we surprisingly show that this analogy to single-item auctions ceases to hold for non-identical 
actions: we prove that there is no generalized Border’s theorem, in the sense of Gopalan et al. [30], for persuasion 
with independent actions. Specifically, we show that it is #P-hard to exactly compute the expected 
sender utility for the optimal scheme, ruling out Border’s-theorem-like approaches to this problem unless 
the polynomial hierarchy collapses. Our proof starts from the ideas of [30], but our reduction is much more 
involved and goes through the membership problem for an implicit polytope which encodes a #P-hard problem; 
we elaborate on these differences in Appendix A. We note that whereas we do rule out computing an 
explicit representation of the optimal scheme which permits evaluating optimal sender utility, we do not rule 
out other approaches which might sample the optimal scheme “on the fly” in the style of Myerson’s optimal 
auction [41]; we leave the intriguing question of whether this is possible as an open problem. 

Finally, we consider the black-box model with general distributions, and prove essentially-matching positive 
and negative results. For our positive result, we exhibit a fully polynomial-time approximation scheme 
(FPTAS) with a bicriteria guarantee. Specifically, our scheme loses an additive $\epsilon$ in both expected sender 
utility and incentive-compatibility (as defined in Section 2), and runs in time polynomial in the number of 
actions and $1/\epsilon$. Our negative results show that this is essentially the best possible for information-theoretic 
reasons: any polynomial-time scheme in the black box model which comes close to optimality must significantly 
sacrifice incentive compatibility, and vice versa. We note that our scheme is related to some prior 
work on BMD with black-box distributions [16, 45], but is significantly simpler and more efficient: instead 
of using the ellipsoid method to optimize over “reduced forms”, our scheme simply solves a single linear 
program on a sample from the prior distribution on states of nature. Such simplicity is possible in our setting 
due to the different notion of incentive compatibility in persuasion, which reduces to incentive compatibility 
on the sample using the principle of deferred decisions. We elaborate on this connection in Appendix A. 

We remark that our results suggest that the differences between persuasion and auction design serve as 
a double-edged sword. This is evidenced by our negative result for the independent model and our “simple” 
positive result for the black-box model. 

1.3 Additional Discussion of Related Work 

To our knowledge, Brocas and Carrillo [14] were the first to explicitly consider persuasion through information 
control. They consider a sender with the ability to costlessly acquire information regarding the payoffs 
of the receiver’s actions, with the stipulation that acquired information is available to both players. This 
is technically equivalent to our (and Kamenica and Gentzkow’s [34]) informed sender who commits to a 
signaling scheme. Brocas and Carrillo restrict attention to a particular setting with two states of nature and 
three actions, and characterize optimal policies for the sender and their associated payoffs. Kamenica and 
Gentzkow’s [34] Bayesian Persuasion model naturally generalizes [14] to finite (or infinite yet compact) 
states of nature and action spaces. They establish a number of properties of optimal information structures 
in this model; most notably, they characterize settings in which signaling strictly benefits the sender in terms 
of the convexity/concavity of the sender’s payoff as a function of the receiver’s posterior belief. 

Since [14] and [34], an explosion of interest in persuasion problems followed. The basic Bayesian 
persuasion model underlies, or is closely related to, recent work in a number of different domains: price 
discrimination by Bergemann et al. [?], advertising by Chakraborty and Harbaugh [17], security games 
by Xu et al. [46] and Rabinovich et al. [42], multi-armed bandits by Kremer et al. [37] and Mansour et al. 
[38], medical research by Kolotilin [35], and financial regulation by Gick and Pausch [28] and Goldstein 
and Leitner [29]. Generalizations and variants of the Bayesian persuasion model have also been considered: 
Gentzkow and Kamenica [?] consider multiple senders, Alonso and Camara [3] consider multiple receivers 
in a voting setting, Gentzkow and Kamenica [?] consider costly information acquisition, Rayo and Segal 
[43] consider an outside option for the receiver, and Kolotilin et al. [36] consider a receiver with private 
side information. 

Optimal persuasion is a special case of information structure design in games, also known as signaling. 
The space of information structures, and their induced equilibria, are characterized by Bergemann and 
Morris [?]. Recent work in the CS community has also examined the design of information structures 
algorithmically. Work by Emek et al. [24], Miltersen and Sheffet [40], Guo and Deligkas [32], and Dughmi et al. 
[23] examines optimal signaling in a variety of auction settings, and presents polynomial-time algorithms 
and hardness results. Dughmi [22] exhibits hardness results for signaling in two-player zero-sum games, and 
Cheng et al. [18] present an algorithmic framework and apply it to a number of different signaling problems. 

Also related to the Bayesian persuasion model is the extensive literature on cheap talk starting with 
Crawford and Sobel [19]. Cheap talk can be viewed as the analogue of persuasion when the sender cannot 
commit to an information revelation policy. Nevertheless, the commitment assumption in persuasion has 
been justified on the grounds that it arises organically in repeated cheap talk interactions with a long horizon, 
in particular when the sender must balance her short-term payoffs with long-term credibility. We refer 
the reader to the discussion of this phenomenon in [43]. Also to this point, Kamenica and Gentzkow [34] 
mention that an earlier model of repeated 2-player games with asymmetric information by Aumann and 
Maschler [?] is mathematically analogous to Bayesian persuasion. 

Various recent models on selling information in [?] are quite similar to Bayesian persuasion, with 
the main difference being that the sender’s utility function is replaced with revenue. Whereas Babaioff et al. 
[?] consider the algorithmic question of selling information when states of nature are explicitly given as 
input, the analogous algorithmic questions to ours have not been considered in their model. We speculate 
that some of our algorithmic techniques might be applicable to models for selling information when the 
prior distribution on states of nature is represented succinctly. 

As discussed previously, our results involve exercising and generalizing ideas from prior work in Bayesian 
mechanism design. We view drawing these connections as one of the contributions of our paper. In 
Appendix A we discuss these connections and differences at length. 


2 Preliminaries 


In a persuasion game, there are two players: a sender and a receiver. The receiver is faced with selecting 
an action from $[n] = \{1, \ldots, n\}$, with an a-priori-unknown payoff to each of the sender and receiver. We 
assume payoffs are a function of an unknown state of nature $\theta$, drawn from an abstract set $\Theta$ of potential 
realizations of nature. Specifically, the sender and receiver’s payoffs are functions $s, r : \Theta \times [n] \to \mathbb{R}$, 
respectively. We use $r = r(\theta) \in \mathbb{R}^n$ to denote the receiver’s payoff vector as a function of the state 
of nature, where $r_i(\theta)$ is the receiver’s payoff if he takes action $i$ and the state of nature is $\theta$. Similarly 
$s = s(\theta) \in \mathbb{R}^n$ denotes the sender’s payoff vector, and $s_i(\theta)$ is the sender’s payoff if the receiver takes 
action $i$ and the state is $\theta$. Without loss of generality, we often conflate the abstract set $\Theta$ indexing states of 
nature with the set of realizable payoff vector pairs $(s, r)$; i.e., we think of $\Theta$ as a subset of $\mathbb{R}^n \times \mathbb{R}^n$. We 
assume that $\Theta$ is finite for notational convenience, though this is not needed for our results in Section 5. 

In Bayesian persuasion, it is assumed that the state of nature is a-priori unknown to the receiver, and 
drawn from a common-knowledge prior distribution $\lambda$ supported on $\Theta$. The sender, on the other hand, has 
access to the realization of $\theta$, and can commit to a policy of partially revealing information regarding its 
realization before the receiver selects his action. Specifically, the sender commits to a signaling scheme $\varphi$, 
mapping (possibly randomly) states of nature $\Theta$ to a family of signals $\Sigma$. For $\theta \in \Theta$, we use $\varphi(\theta)$ to denote 
the (possibly random) signal selected when the state of nature is $\theta$. Moreover, we use $\varphi(\theta, \sigma)$ to denote the 
probability of selecting the signal $\sigma$ given a state of nature $\theta$. An algorithm implements a signaling scheme 
$\varphi$ if it takes as input a state of nature $\theta$, and samples the random variable $\varphi(\theta)$. 

Given a signaling scheme $\varphi$ with signals $\Sigma$, each signal $\sigma \in \Sigma$ is realized with probability 
$\alpha_\sigma = \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma)$. Conditioned on the signal $\sigma$, the expected payoffs to the receiver of the various actions 
are summarized by the vector $r(\sigma) = \frac{1}{\alpha_\sigma} \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma) r(\theta)$. Similarly, the sender’s payoffs as a function 
of the receiver’s action are summarized by $s(\sigma) = \frac{1}{\alpha_\sigma} \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma) s(\theta)$. On receiving a signal $\sigma$, the 
receiver performs a Bayesian update and selects an action $i^*(\sigma) \in \arg\max_i r_i(\sigma)$ with expected receiver 
utility $\max_i r_i(\sigma)$. This induces utility $s_{i^*(\sigma)}(\sigma)$ for the sender. In the event of ties when selecting $i^*(\sigma)$, 
we assume those ties are broken in favor of the sender. 
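
The definitions above translate directly into a small evaluator for an arbitrary finite scheme. This is a minimal sketch under stated assumptions: the dictionary-based representation and the name `sender_utility` are illustrative choices, not part of the paper, and exact rationals are used so that tie detection is exact (with floating point a tolerance would be needed).

```python
from fractions import Fraction

def sender_utility(prior, phi, s, r):
    """Expected sender utility of a signaling scheme when the receiver best-responds.

    prior: dict mapping state -> probability lambda(theta)
    phi:   dict mapping state -> {signal: probability phi(theta, sigma)}
    s, r:  dicts mapping state -> list of sender / receiver payoffs per action
    """
    signals = {sig for dist in phi.values() for sig in dist}
    n = len(next(iter(r.values())))
    total = Fraction(0)
    for sig in signals:
        # Unnormalised posterior payoffs alpha_sigma * r(sigma) and alpha_sigma * s(sigma).
        # The common factor alpha_sigma does not change the receiver's argmax.
        weight = {t: prior[t] * phi[t].get(sig, 0) for t in prior}
        r_sig = [sum(weight[t] * r[t][i] for t in prior) for i in range(n)]
        s_sig = [sum(weight[t] * s[t][i] for t in prior) for i in range(n)]
        best = max(r_sig)
        # Ties among the receiver's best responses are broken in favour of the sender.
        total += max(s_sig[i] for i in range(n) if r_sig[i] == best)
    return total

# Prosecutor example from Section 1.1: action 0 = convict, action 1 = acquit.
prior = {"guilty": Fraction(1, 3), "innocent": Fraction(2, 3)}
r = {"guilty": [1, 0], "innocent": [0, 1]}     # judge wants the correct verdict
s = {"guilty": [1, 0], "innocent": [1, 0]}     # prosecutor wants a conviction
phi = {"guilty": {"guilt": Fraction(1)},
       "innocent": {"guilt": Fraction(1, 2), "innocence": Fraction(1, 2)}}
print(sender_utility(prior, phi, s, r))        # 2/3
```

Because ties are broken in favor of the sender, the scheme that claims “guilt” for exactly half of the innocent defendants attains 2/3 here.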

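We adopt the perspective of a sender looking to design $\varphi$ to maximize her expected utility $\sum_{\sigma \in \Sigma} \alpha_\sigma s_{i^*(\sigma)}(\sigma)$, 
in which case we say $\varphi$ is optimal. When $\varphi$ yields expected sender utility within an additive [multiplicative] 
$\epsilon$ of the best possible, we say it is $\epsilon$-optimal [$\epsilon$-approximate] in the additive [multiplicative] sense. A simple 
revelation-principle style argument [34] shows that an optimal signaling scheme need not use more than $n$ 
signals, with one recommending each action. Such a direct scheme $\varphi$ has signals $\Sigma = \{\sigma_1, \ldots, \sigma_n\}$, and 
satisfies $r_i(\sigma_i) \ge r_j(\sigma_i)$ for all $i, j \in [n]$. We think of $\sigma_i$ as a signal recommending action $i$, and the requirement 
$r_i(\sigma_i) \ge \max_j r_j(\sigma_i)$ as an incentive-compatibility (IC) constraint on our signaling scheme. We can 
now write the sender’s optimization problem as the following LP with variables $\{\varphi(\theta, \sigma_i) : \theta \in \Theta, i \in [n]\}$. 

$$
\begin{array}{lll}
\text{maximize} & \sum_{\theta \in \Theta} \sum_{i=1}^{n} \lambda(\theta) \varphi(\theta, \sigma_i) s_i(\theta) & \\
\text{subject to} & \sum_{i=1}^{n} \varphi(\theta, \sigma_i) = 1, & \text{for } \theta \in \Theta. \\
& \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma_i) r_i(\theta) \ge \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma_i) r_j(\theta), & \text{for } i, j \in [n]. \\
& \varphi(\theta, \sigma_i) \ge 0, & \text{for } \theta \in \Theta, i \in [n].
\end{array}
\tag{1}
$$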

For our results in Section 5, we relax our incentive constraints by assuming that the receiver follows the 
recommendation so long as it approximately maximizes his utility: for a parameter $\epsilon > 0$, we relax our 
requirement to $r_i(\sigma_i) \ge \max_j r_j(\sigma_i) - \epsilon$, which translates to the relaxed IC constraints 
$\sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma_i) r_i(\theta) \ge \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma_i) (r_j(\theta) - \epsilon)$ in LP (1). We call such schemes $\epsilon$-incentive compatible ($\epsilon$-IC). We judge the 
suboptimality of an $\epsilon$-IC scheme relative to the best (absolutely) IC scheme; i.e., in a bi-criteria sense. 
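
To make the relaxed constraint concrete, the following sketch computes the smallest $\epsilon$ for which a given direct scheme is $\epsilon$-IC, namely the largest posterior gain $r_j(\sigma_i) - r_i(\sigma_i)$ from ignoring a recommendation. The function name `ic_slack` and the list-based encoding of a direct scheme are assumptions made for illustration.

```python
from fractions import Fraction

def ic_slack(prior, phi, r):
    """Smallest epsilon >= 0 such that the direct scheme phi is epsilon-IC.

    prior: dict state -> probability
    phi:   dict state -> list, phi[theta][i] = probability of recommending action i
    r:     dict state -> list of receiver payoffs per action
    """
    n = len(next(iter(r.values())))
    worst = Fraction(0)
    for i in range(n):
        alpha = sum(prior[t] * phi[t][i] for t in prior)
        if alpha == 0:
            continue  # signal sigma_i is never sent, so its constraints hold trivially
        for j in range(n):
            # Posterior gain r_j(sigma_i) - r_i(sigma_i) from deviating to action j.
            gain = sum(prior[t] * phi[t][i] * (r[t][j] - r[t][i]) for t in prior) / alpha
            worst = max(worst, gain)
    return worst

# Prosecutor example: action 0 = convict, action 1 = acquit.
prior = {"guilty": Fraction(1, 3), "innocent": Fraction(2, 3)}
r = {"guilty": [1, 0], "innocent": [0, 1]}
half = {"guilty": [1, 0], "innocent": [Fraction(1, 2), Fraction(1, 2)]}
greedy = {"guilty": [1, 0], "innocent": [Fraction(3, 4), Fraction(1, 4)]}
print(ic_slack(prior, half, r))      # 0
print(ic_slack(prior, greedy, r))    # 1/5
```

The second scheme recommends conviction more often (with probability 5/6 rather than 2/3) but is only $\epsilon$-IC for $\epsilon \ge 1/5$, which illustrates the bi-criteria trade-off between sender utility and incentive compatibility.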

Finally, we note that expected utilities, incentive compatibility, and optimality are properties not only 
of a signaling scheme $\varphi$, but also of the distribution $\lambda$ over its inputs. When $\lambda$ is not clear from context 
and $\varphi$ is defined on a superset of the support of $\lambda$, we often say that a signaling scheme $\varphi$ is IC [$\epsilon$-IC] for $\lambda$, or optimal 
[$\epsilon$-optimal] for $\lambda$. We also use $u_s(\varphi, \lambda)$ to denote the expected sender utility $\sum_{\theta \in \Theta} \sum_{i=1}^{n} \lambda(\theta) \varphi(\theta, \sigma_i) s_i(\theta)$. 

3 Persuasion with I.I.D. Actions 

In this section, we assume the payoffs of different actions are independently and identically distributed (i.i.d.) 
according to an explicitly-described marginal distribution. Specifically, each state of nature $\theta$ is a vector in 
$\Theta = [m]^n$ for a parameter $m$, where $\theta_i \in [m]$ is the type of action $i$. Associated with each type $j \in [m]$ is 
a pair $(\xi_j, \rho_j) \in \mathbb{R}^2$, where $\xi_j$ [$\rho_j$] is the payoff to the sender [receiver] when the receiver chooses an action 
with type $j$. We are given a marginal distribution over types, described by a vector $q = (q_1, \ldots, q_m) \in \Delta_m$. 
We assume each action’s type is drawn independently according to $q$; specifically, the prior distribution $\lambda$ 
on states of nature is given by $\lambda(\theta) = \prod_{i \in [n]} q_{\theta_i}$. For convenience, we let $\xi = (\xi_1, \ldots, \xi_m) \in \mathbb{R}^m$ and 
$\rho = (\rho_1, \ldots, \rho_m) \in \mathbb{R}^m$ denote the type-indexed vectors of sender and receiver payoffs, respectively. We 
assume $\xi$, $\rho$, and $q$ (the parameters describing an i.i.d. persuasion instance) are given explicitly. 
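
As a small illustration of the product prior, the sketch below enumerates $[m]^n$ and evaluates $\lambda(\theta) = \prod_i q_{\theta_i}$. The helper name `iid_prior` is hypothetical, types are 0-indexed for convenience, and the particular vector `q` is an arbitrary example rather than one taken from the paper.

```python
from fractions import Fraction
from itertools import product
from math import prod

def iid_prior(q, n):
    """Map each state theta in [m]^n (types 0-indexed) to lambda(theta) = prod_i q[theta_i]."""
    m = len(q)
    return {theta: prod(q[t] for t in theta) for theta in product(range(m), repeat=n)}

q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # example marginal over m = 3 types
lam = iid_prior(q, n=2)
print(len(lam))             # 9 states, i.e. m**n
print(lam[(0, 1)])          # 1/6
print(sum(lam.values()))    # 1
```

The table has $m^n$ entries, which is why explicit enumeration of this kind is only feasible for very small instances.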

Note that the number of states of nature is $m^n$, and therefore the natural representation of a signaling 
scheme has $n m^n$ variables. Moreover, the natural linear program for the persuasion problem in Section 2 
has an exponential in $n$ number of both variables and constraints. Nevertheless, as mentioned in Section 2, 
we seek only to implement an optimal or near-optimal scheme $\varphi$ as an oracle which takes as input $\theta$ and 
samples a signal $\sigma \sim \varphi(\theta)$. Our algorithms will run in time polynomial in $n$ and $m$, and will optimize over 
a space of succinct “reduced forms” for signaling schemes which we term signatures, to be described next. 

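For a state of nature $\theta$, define the matrix $M^\theta \in \{0,1\}^{n \times m}$ so that $M^\theta_{ij} = 1$ if and only if action $i$ has type 
$j$ in $\theta$ (i.e. $\theta_i = j$). Given an i.i.d. prior $\lambda$ and a signaling scheme $\varphi$ with signals $\Sigma = \{\sigma_1, \ldots, \sigma_n\}$, for each 
$i \in [n]$ let $\alpha_i = \sum_{\theta} \lambda(\theta) \varphi(\theta, \sigma_i)$ denote the probability of sending $\sigma_i$, and let $M^{\sigma_i} = \sum_{\theta} \lambda(\theta) \varphi(\theta, \sigma_i) M^\theta$. 
Note that $M^{\sigma_i}_{jk}$ is the joint probability that action $j$ has type $k$ and the scheme outputs $\sigma_i$. Also note that each 
row of $M^{\sigma_i}$ sums to $\alpha_i$, and the $j$th row represents the un-normalized posterior type distribution of action $j$ 
given signal $\sigma_i$. We call $\mathcal{M} = (M^{\sigma_1}, \ldots, M^{\sigma_n}) \in \mathbb{R}^{n \times m \times n}$ the signature of $\varphi$. The sender’s objective and 
receiver’s IC constraints can both be expressed in terms of the signature. In particular, using $M_j$ to denote 
the $j$th row of a matrix $M$, the IC constraints are $\rho \cdot M^{\sigma_i}_i \ge \rho \cdot M^{\sigma_i}_j$ for all $i, j \in [n]$, and the sender’s 
expected utility assuming the receiver follows the scheme’s recommendations is $\sum_{i \in [n]} \xi \cdot M^{\sigma_i}_i$. 

$$
\begin{array}{ll}
M^{\sigma_i} = \sum_{\theta \in \Theta} \lambda(\theta) \varphi(\theta, \sigma_i) M^\theta, & \text{for } i = 1, \ldots, n. \\
\sum_{i=1}^{n} \varphi(\theta, \sigma_i) = 1, & \text{for } \theta \in \Theta. \\
\varphi(\theta, \sigma_i) \ge 0, & \text{for } \theta \in \Theta, i \in [n].
\end{array}
$$

Figure 1: Realizable Signatures $\mathcal{P}$ 

$$
\begin{array}{ll}
\text{max} & \sum_{i=1}^{n} \xi \cdot M^{\sigma_i}_i \\
\text{s.t.} & \rho \cdot M^{\sigma_i}_i \ge \rho \cdot M^{\sigma_i}_j, \quad \text{for } i, j \in [n]. \\
& (M^{\sigma_1}, \ldots, M^{\sigma_n}) \in \mathcal{P}
\end{array}
$$

Figure 2: Persuasion in Signature Space 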

We say $\mathcal{M} = (M^{\sigma_1}, \ldots, M^{\sigma_n}) \in \mathbb{R}^{n \times m \times n}$ is realizable if there exists a signaling scheme $\varphi$ with $\mathcal{M}$ as 
its signature. Realizable signatures constitute a polytope $\mathcal{P} \subseteq \mathbb{R}^{n \times m \times n}$, which has an exponential-sized 
extended formulation as shown in Figure 1. Given this characterization, the sender’s optimization problem can 
be written as a linear program in the space of signatures, shown in Figure 2. 
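
The signature of a concrete scheme can be computed by brute force on a small instance, which also lets one check the stated row-sum property and evaluate the objective and IC constraints in signature space. The sketch below does this for the two-stock example of Section 1.1; the function names `signature` and `dot`, the 0-indexed types, and the specific value of epsilon are assumptions for illustration, and the enumeration is exponential in $n$ (unlike the paper's algorithms).

```python
from fractions import Fraction
from itertools import product

def signature(q, n, phi):
    """Signature of a direct scheme under an i.i.d. prior.

    q:   type distribution (list of Fractions), types 0-indexed
    phi: function theta -> list of n recommendation probabilities
    Returns sig with sig[i][j][k] = Pr[action j has type k and sigma_i is sent].
    """
    m = len(q)
    sig = [[[Fraction(0)] * m for _ in range(n)] for _ in range(n)]
    for theta in product(range(m), repeat=n):
        lam = Fraction(1)
        for t in theta:
            lam *= q[t]
        probs = phi(theta)
        for i in range(n):
            for j in range(n):
                sig[i][j][theta[j]] += lam * probs[i]
    return sig

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Two-stock example: types 0, 1, 2 stand for L, M, H.
def phi(theta):
    is_m = [t == 1 for t in theta]
    if sum(is_m) == 1:
        return [Fraction(int(b)) for b in is_m]
    return [Fraction(1, 2), Fraction(1, 2)]

eps = Fraction(1, 100)                           # assumed small epsilon
q = [Fraction(1, 3)] * 3
xi = [Fraction(0), Fraction(1), Fraction(0)]     # sender (long-term) payoffs
rho = [Fraction(0), 1 + eps, Fraction(2)]        # receiver (short-term) payoffs
sig = signature(q, 2, phi)

utility = sum(dot(xi, sig[i][i]) for i in range(2))
is_ic = all(dot(rho, sig[i][i]) >= dot(rho, sig[i][j]) for i in range(2) for j in range(2))
print(utility, is_ic)                            # 5/9 True
print([sum(row) for row in sig[0]])              # each row of M^{sigma_1} sums to alpha_1
```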

3.1 Symmetry of the Optimal Signaling Scheme 

We now show that there always exists a “symmetric” optimal scheme when actions are i.i.d. Given a signature 
$\mathcal{M} = (M^{\sigma_1}, \ldots, M^{\sigma_n})$, it will sometimes be convenient to think of it as the set of pairs $\{(M^{\sigma_i}, \sigma_i)\}_{i \in [n]}$. 

Definition 3.1. A signaling scheme $\varphi$ with signature $\{(M^{\sigma_i}, \sigma_i)\}_{i \in [n]}$ is symmetric if there exist $x, y \in \mathbb{R}^m$ 
such that $M^{\sigma_i}_i = x$ for all $i \in [n]$ and $M^{\sigma_i}_j = y$ for all $j \ne i$. The pair $(x, y)$ is the s-signature of $\varphi$. 

In other words, a symmetric signaling scheme sends each signal with equal probability $\|x\|_1$, and induces 
only two different posterior type distributions for actions: $x / \|x\|_1$ for the recommended action, and $y / \|y\|_1$ 
for the others. We call $(x, y)$ realizable if there exists a signaling scheme with $(x, y)$ as its s-signature. The 
family of realizable s-signatures constitutes a polytope $\mathcal{P}_s$, and has an extended formulation by adding the 
variables $x, y \in \mathbb{R}^m$ and constraints $M^{\sigma_i}_i = x$ and $M^{\sigma_i}_j = y$ for all $i, j \in [n]$ with $i \ne j$ to the extended 
formulation of (asymmetric) realizable signatures from Figure 1. 

We make two simple observations regarding realizable s-signatures. First, $\|x\|_1 = \|y\|_1 = \frac{1}{n}$ for 
each $(x, y) \in \mathcal{P}_s$, and this is because both $\|x\|_1$ and $\|y\|_1$ equal the probability of each of the $n$ signals. 
Second, since the signature must be consistent with prior marginal distribution $q$, we have $x + (n - 1) y = 
\sum_{i=1}^{n} M^{\sigma_i}_1 = q$. We show that restricting to symmetric signaling schemes is without loss of generality. 

Theorem 3.2. When the action payoffs are i.i.d., there exists an optimal and incentive-compatible signaling 
scheme which is symmetric. 

Theorem 3.2 is proved in Appendix B.1. At a high level, we show that optimal signaling schemes are 
closed with respect to two operations: convex combination and permutation. Specifically, a convex combination 
of realizable signatures, viewed as vectors in $\mathbb{R}^{n \times m \times n}$, is realized by the corresponding “random 
mixture” of signaling schemes, and this operation preserves optimality. The proof of this fact follows easily 
from the fact that the linear program in Figure 2 has a convex family of optimal solutions. Moreover, given a 
permutation $\pi \in S_n$ and an optimal signature $\mathcal{M} = \{(M^{\sigma_i}, \sigma_i)\}_{i \in [n]}$ realized by signaling scheme $\varphi$, the 
“permuted” signature $\pi(\mathcal{M}) = \{(\pi M^{\sigma_i}, \sigma_{\pi(i)})\}_{i \in [n]}$, where premultiplication of a matrix by $\pi$ denotes 
permuting the rows of the matrix, is realized by the “permuted” scheme $\varphi_\pi(\theta) = \pi(\varphi(\pi^{-1}(\theta)))$, which 
is also optimal. The proof of this fact follows from the “symmetry” of the (i.i.d.) prior distribution about the 
different actions. Theorem 3.2 is then proved constructively as follows: given a realizable optimal signature 
$\mathcal{M}$, the “symmetrized” signature $\overline{\mathcal{M}} = \frac{1}{n!} \sum_{\pi \in S_n} \pi(\mathcal{M})$ is realizable, optimal, and symmetric. 
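
The symmetrization step can be illustrated on the two-stock example by averaging a scheme over all relabelings of the actions. This is a sketch, not the construction from Appendix B.1: the helper names `symmetrize` and `signal_one_rows` are hypothetical, the asymmetric starting scheme `phi_asym` is an illustrative variant that always recommends the first stock when there is not exactly one type-M stock, and the enumeration over all $n!$ permutations is only practical for tiny $n$.

```python
from fractions import Fraction
from itertools import permutations, product

def symmetrize(phi, n):
    """Uniform mixture of the permuted schemes over all n! relabelings of the actions."""
    perms = list(permutations(range(n)))
    def phi_sym(theta):
        out = [Fraction(0)] * n
        for p in perms:
            # Relabel: position i of the relabeled state holds original action p[i].
            relabeled = tuple(theta[p[i]] for i in range(n))
            probs = phi(relabeled)
            for i in range(n):
                out[p[i]] += Fraction(probs[i]) / len(perms)   # map recommendation back
        return out
    return phi_sym

def signal_one_rows(q, n, phi):
    """Rows 1 and 2 of M^{sigma_1}; these are (x, y) when phi is symmetric."""
    m = len(q)
    x, y = [Fraction(0)] * m, [Fraction(0)] * m
    for theta in product(range(m), repeat=n):
        lam = Fraction(1)
        for t in theta:
            lam *= q[t]
        p1 = phi(theta)[0]
        x[theta[0]] += lam * p1
        y[theta[1]] += lam * p1
    return x, y

# Asymmetric variant of the two-stock scheme (types 0, 1, 2 = L, M, H):
# recommend the unique type-M stock if there is one, otherwise always stock 1.
def phi_asym(theta):
    is_m = [t == 1 for t in theta]
    if sum(is_m) == 1:
        return [Fraction(int(b)) for b in is_m]
    return [Fraction(1), Fraction(0)]

n, q = 2, [Fraction(1, 3)] * 3
x0, _ = signal_one_rows(q, n, phi_asym)
x, y = signal_one_rows(q, n, symmetrize(phi_asym, n))
print(sum(x0))                                         # 7/9: signals are not equally likely
print(sum(x), sum(y))                                  # 1/2 1/2
print([a + (n - 1) * b for a, b in zip(x, y)] == q)    # True
```

After symmetrization both observations about s-signatures hold on this instance: each signal has probability $1/n$, and $x + (n-1)y$ recovers the prior marginal $q$.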
3.2 Implementing the Optimal Signaling Scheme 

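We now exhibit a polynomial-time algorithm for persuasion in the i.i.d. model. Theorem 3.2 permits 
re-writing the optimization problem in Figure 2 as follows, with variables $x, y \in \mathbb{R}^m$. 

$$
\begin{array}{ll}
\text{maximize} & n \xi \cdot x \\
\text{subject to} & \rho \cdot x \ge \rho \cdot y \\
& (x, y) \in \mathcal{P}_s
\end{array}
\tag{2}
$$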


Problem (2) cannot be solved directly, since $\mathcal{P}_s$ is defined by an extended formulation with exponentially 
many variables and constraints, as described in Section 3.1. Nevertheless, we make use of a connection 
between symmetric signaling schemes and single-item auctions with i.i.d. bidders to solve (2) using the 
Ellipsoid method. Specifically, we show a one-to-one correspondence between symmetric signatures and (a 
subset of) symmetric reduced forms of single-item auctions with i.i.d. bidders, defined as follows. 


Definition 3.3 ([?]). Consider a single-item auction setting with $n$ i.i.d. bidders and $m$ types for each 
bidder, where each bidder’s type is distributed according to $q \in \Delta_m$. An allocation rule is a randomized 
function $A$ mapping a type profile $\theta \in [m]^n$ to a winner $A(\theta) \in [n] \cup \{*\}$, where $*$ denotes not allocating 
the item. We say the allocation rule has symmetric reduced form $\tau \in [0,1]^m$ if for each bidder $i \in [n]$ and 
type $j \in [m]$, $\tau_j$ is the conditional probability of $i$ receiving the item given she has type $j$. 


When $q$ is clear from context, we say $\tau$ is realizable if there exists an allocation rule with $\tau$ as its symmetric 
reduced form. We say an algorithm implements an allocation rule $A$ if it takes as input $\theta$, and samples $A(\theta)$. 


Theorem 3.4. Consider the Bayesian Persuasion problem with $n$ i.i.d. actions and $m$ types, with parameters 
$q \in \Delta_m$, $\xi \in \mathbb{R}^m$, and $\rho \in \mathbb{R}^m$ given explicitly. An optimal and incentive-compatible signaling scheme can 
be implemented in poly(m, n) time. 

Theorem 3.4 is a consequence of the following set of lemmas. 


Lemma 3.5. Let $(x, y) \in [0,1]^m \times [0,1]^m$, and define $\tau = \left(\frac{x_1}{q_1}, \ldots, \frac{x_m}{q_m}\right)$. The pair $(x, y)$ is a realizable 
s-signature if and only if (a) $\|x\|_1 = \frac{1}{n}$, (b) $x + (n-1)y = q$, and (c) $\tau$ is a realizable symmetric reduced 
form of an allocation rule with $n$ i.i.d. bidders, $m$ types, and type distribution $q$. Moreover, assuming $x$ and 
$y$ satisfy (a), (b) and (c), and given black-box access to an allocation rule $A$ with symmetric reduced form 
$\tau$, a signaling scheme with s-signature $(x, y)$ can be implemented in poly(n, m) time. 


Lemma 3.6. An optimal realizable s-signature, as described by LP (2), is computable in poly(n, m) time. 

Lemma 3.7. (See [?].) Consider a single-item auction setting with $n$ i.i.d. bidders and $m$ types for each 
bidder, where each bidder’s type is distributed according to $q \in \Delta_m$. Given a realizable symmetric reduced 
form $\tau \in [0,1]^m$, an allocation rule with reduced form $\tau$ can be implemented in poly(n, m) time. 


The proofs of Lemmas 3.5 and 3.6 can be found in Appendix B.2. The proof of Lemma 3.5 builds 
a correspondence between s-signatures of signaling schemes and certain reduced-form allocation rules. 
Specifically, actions correspond to bidders, action types correspond to bidder types, and signaling $\sigma_i$ 
corresponds to assigning the item to bidder $i$. The expression of the reduced form in terms of the s-signature 
then follows from Bayes’ rule. Lemma 3.6 follows from Lemma 3.5, the ellipsoid method, and the fact that 
symmetric reduced forms admit an efficient separation oracle (see [?]). 
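
To make condition (c) of Lemma 3.5 concrete, the following Python sketch checks realizability of a symmetric reduced form by brute force. It assumes the standard symmetric form of Border's inequalities, $n \sum_{j \in S} q_j \tau_j \le 1 - (1 - \sum_{j \in S} q_j)^n$ for every set $S$ of types, which is not spelled out in this section. The function names are hypothetical, types are indexed from 0, and the enumeration is exponential in $m$, so this is only an illustration and not the polynomial-time separation oracle referred to above.

```python
from itertools import combinations


def reduced_form_from_signature(x, q):
    """tau_j = x_j / q_j, as in the statement of Lemma 3.5."""
    return [xj / qj for xj, qj in zip(x, q)]


def is_realizable_reduced_form(tau, q, n, tol=1e-12):
    """Brute-force check of Border's inequalities (assumed form, see lead-in)."""
    m = len(q)
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            mass = sum(q[j] for j in subset)
            # Probability that the winner is a bidder whose type lies in subset.
            win = n * sum(q[j] * tau[j] for j in subset)
            # This cannot exceed the probability that such a bidder exists.
            if win > 1 - (1 - mass) ** n + tol:
                return False
    return True
```

For example, with $n = 2$ and $q = (1/2, 1/2)$, the pair $x = (1/4, 1/4)$ passes the check, whereas $x = (1/2, 0)$ would require a bidder of the first type to win with certainty, which fails whenever both bidders have that type.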
Algorithm 1 Independent Signaling Scheme 

Input: Sender payoff vector $\xi$, receiver payoff vector $\rho$, prior distribution $q$ 
Input: State of nature $\theta \in [m]^n$ 
Output: An $n$-dimensional binary signal $\sigma \in \{\text{HIGH}, \text{LOW}\}^n$ 

1: Compute an optimal solution $(x^*, y^*)$ of linear program (3). 
2: For each action $i$ independently, set component signal $o_i$ to HIGH with probability $x^*_{\theta_i} / q_{\theta_i}$, and to LOW 
otherwise, where $\theta_i$ is the type of action $i$ in the input state $\theta$. 
3: Return $\sigma = (o_1, \ldots, o_n)$. 
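
Steps 2 and 3 of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the optimal solution $x^*$ of LP (3) is taken as given rather than computed, types are indexed from 0, and all function names are hypothetical.

```python
import random


def high_probability(theta_i, x_star, q):
    """Pr[component signal is HIGH | the action has type theta_i]."""
    return x_star[theta_i] / q[theta_i]


def independent_signal(theta, x_star, q, rng=random):
    """One independent HIGH/LOW draw per action (steps 2 and 3)."""
    return ["HIGH" if rng.random() < high_probability(t, x_star, q) else "LOW"
            for t in theta]


def marginal_high(x_star, q):
    """Unconditional Pr[HIGH] = sum_j q_j * (x*_j / q_j) = ||x*||_1."""
    return sum(q[j] * high_probability(j, x_star, q) for j in range(len(q)))
```

Because $\|x^*\|_1 = 1/n$ in LP (3), `marginal_high` returns $1/n$ for any feasible $x^*$, and Bayes' rule then gives the posterior $n x^*$ conditioned on HIGH.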

3.3 A Simple (1 − 1/e)-Approximate Scheme 

Our next result is a “simple” signaling scheme which obtains a $(1 - 1/e)$ multiplicative approximation when 
payoffs are nonnegative. This algorithm has the distinctive property that it signals independently for each 
action, and therefore implies that approximately optimal persuasion can be parallelized among multiple 
colluding senders, each of whom only has access to the type of one or more of the actions. 

Recall from Section 3.1 that an s-signature $(x, y)$ satisfies $\|x\|_1 = \|y\|_1 = \frac{1}{n}$ and $x + (n-1)y = q$. 
Our simple scheme, shown in Algorithm 1, works with the following explicit linear programming relaxation 
of optimization problem (2). 

$$
\begin{array}{lll}
\text{maximize} & n\,\xi \cdot x & \\
\text{subject to} & \rho \cdot x \ge \rho \cdot y & \\
 & x + (n-1)y = q & \qquad (3) \\
 & \|x\|_1 = \frac{1}{n} & \\
 & x, y \ge 0 &
\end{array}
$$

Algorithm 1 has a simple and instructive interpretation. It computes the optimal solution $(x^*, y^*)$ to 
the relaxed problem (3), and uses this solution as a guide for signaling independently for each action. The 
algorithm selects, independently for each action $i$, a component signal $o_i \in \{\text{HIGH}, \text{LOW}\}$. In particular, 
each $o_i$ is chosen so that $\Pr[o_i = \text{HIGH}] = \frac{1}{n}$, and moreover the events $o_i = \text{HIGH}$ and $o_i = \text{LOW}$ 
induce the posterior beliefs $nx^*$ and $ny^*$, respectively, regarding the type of action $i$. 

The signaling scheme implemented by Algorithm 1 approximately matches the optimal value of (3), 
as shown in Theorem 3.8, assuming the receiver is rational and therefore selects an action with a HIGH 
component signal if one exists. We note that the scheme of Algorithm 1, while not a direct scheme as 
described, can easily be converted into one; specifically, by recommending an action whose component 
signal is HIGH when one exists (breaking ties arbitrarily), and recommending an arbitrary action otherwise. 
Theorem 3.8 follows from the fact that $(x^*, y^*)$ is an optimal solution to LP (3), the fact that the posterior 
type distribution of an action $i$ is $nx^*$ when $o_i = \text{HIGH}$ and $ny^*$ when $o_i = \text{LOW}$, and the fact that each 
component signal is HIGH independently with probability $\frac{1}{n}$. We defer the formal proof to Appendix B.3. 

Theorem 3.8. Algorithm 1 runs in poly(m, n) time, and serves as a $(1 - \frac{1}{e})$-approximate signaling scheme 
for the Bayesian Persuasion problem with $n$ i.i.d. actions, $m$ types, and nonnegative payoffs. 
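
The constant in Theorem 3.8 comes from the probability that at least one of $n$ independent component signals is HIGH when each is HIGH with probability $1/n$. The short Python check below (the function name is hypothetical) evaluates $1 - (1 - 1/n)^n$ and compares it with $1 - 1/e$ for a few values of $n$.

```python
import math


def prob_some_high(n):
    """Probability that at least one of n independent signals is HIGH,
    when each is HIGH with probability 1/n."""
    return 1 - (1 - 1 / n) ** n


if __name__ == "__main__":
    for n in (1, 2, 10, 1000):
        print(n, prob_some_high(n), prob_some_high(n) >= 1 - 1 / math.e)
```

The quantity decreases toward $1 - 1/e$ as $n$ grows, so the bound is tight only in the limit of many actions.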

Remark 3.9. Algorithm 1 signals independently for each action. This conveys an interesting conceptual 
message. That is, even though the optimal signaling scheme might induce posterior beliefs which correlate 
different actions, it is nevertheless true that signaling for each action independently yields an approximately 
optimal signaling scheme. As a consequence, collaborative persuasion by multiple parties (the senders), 
each of whom observes the payoff of one or more actions, is a task that can be parallelized, requiring no 
coordination when actions are identical and independent and only an approximate solution is sought. We 
leave open the question of whether this is possible when action payoffs are independently but not identically 
distributed. 


4 Complexity Barriers to Persuasion with Independent Actions 

In this section, we consider optimal persuasion with independent action payoffs as in Section 3, albeit 
with action-specific marginal distributions given explicitly. Specifically, for each action $i$ we are given 
a distribution $q^i \in \Delta_{m_i}$ on $m_i$ types, and each type $j \in [m_i]$ of action $i$ is associated with a sender 
payoff $\xi^i_j \in \mathbb{R}$ and a receiver payoff $\rho^i_j \in \mathbb{R}$. The positive results of Section 3 draw a connection between 
optimal persuasion in the special case of identically distributed actions and Border’s characterization of 
reduced-form single-item auctions with i.i.d. bidders. One might expect this connection to generalize to the 
independent non-identical persuasion setting, since Border’s theorem extends to single-item auctions with 
independent non-identical bidders. Surprisingly, we show that this analogy to Border’s characterization fails 
to generalize. We prove the following theorem. 


Theorem 4.1. Consider the Bayesian Persuasion problem with independent actions, with action-specific 
payoff distributions given explicitly. It is #P-hard to compute the optimal expected sender utility. 


Invoking the framework of Gopalan et al., this rules out a generalized Border’s theorem for our 
setting, in the sense defined by Gopalan et al., unless the polynomial hierarchy collapses. We view this result 
as illustrating some of the important differences between persuasion and mechanism design. 

The proof of Theorem 4.1 is rather involved. We defer the full proof to Appendix C and only present a 
sketch here. Our proof starts from the ideas of Gopalan et al., who show #P-hardness for revenue or 
welfare maximization in several mechanism design problems. In one case, they reduce from the #P-hard 
problem of computing the Khintchine constant of a vector. Our reduction also starts from this problem, but 
is much more involved (*). First, we exhibit a polytope which we term the Khintchine polytope, and show that 
computing the Khintchine constant reduces to linear optimization over the Khintchine polytope. Second, 
we present a reduction from the membership problem for the Khintchine polytope to the computation of 
optimal sender utility in a particularly-crafted instance of persuasion with independent actions. Invoking the 
polynomial-time equivalence between membership checking and optimization (see, e.g., [?]), we conclude 
the #P-hardness of our problem. The main technical challenge we overcome is in the second step of our 
proof: given a point $x$ which may or may not be in the Khintchine polytope $\mathcal{K}$, we construct a persuasion 
instance and a threshold $T$ so that points in $\mathcal{K}$ encode signaling schemes, and the optimal sender utility is at 
least $T$ if and only if $x \in \mathcal{K}$ and the scheme corresponding to $x$ results in sender utility $T$. 


(*) In Gopalan et al., Myerson’s characterization is used to show that optimal mechanism design in a public project setting directly encodes 
computation of the Khintchine constant. No analogous direct connection seems to hold here. 


Proof Sketch of Theorem 4.1 

The Khintchine problem, shown to be #P-hard in [?], is to compute the Khintchine constant $K(a)$ of a 
given vector $a \in \mathbb{R}^n$, defined as $K(a) = \mathbb{E}_{\theta}\left[\,|\theta \cdot a|\,\right]$ where $\theta$ is drawn uniformly at random from 
$\{\pm 1\}^n$. To relate the Khintchine problem to Bayesian persuasion, we begin with a persuasion instance with 
$n$ i.i.d. actions and two action types, which we refer to as type $-1$ and type $+1$. The state of nature is a uniform 
random draw from the set $\{\pm 1\}^n$, with the $i$th entry specifying the type of action $i$. We call this instance the 
Khintchine-like persuasion setting. As in Section 3, we still use the signature to capture the payoff-relevant 
features of a signaling scheme, but we pay special attention to signaling schemes which use only two signals, 
in which case we represent them using a two-signal signature of the form $(M^1, M^2) \in \mathbb{R}^{n \times 2} \times \mathbb{R}^{n \times 2}$. The 
Khintchine polytope $\mathcal{K}(n)$ is then defined as the (convex) family of all realizable two-signal signatures for 
the Khintchine-like persuasion problem with an additional constraint: each signal is sent with probability 
exactly $\frac{1}{2}$. We first prove that general linear optimization over $\mathcal{K}(n)$ is #P-hard by encoding computation of 
the Khintchine constant as linear optimization over $\mathcal{K}(n)$. In this reduction, the optimal solution in $\mathcal{K}(n)$ is 
the signature of the two-signal scheme $\varphi(\theta) = \mathrm{sign}(\theta \cdot a)$, which signals $+$ and $-$ each with probability $\frac{1}{2}$. 
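
For intuition about the quantity being encoded, the Khintchine constant can be computed by direct enumeration of all $2^n$ sign vectors. The Python sketch below (the function name is hypothetical) does exactly that; its exponential running time is consistent with the #P-hardness of the problem and it is meant only as an illustration of the definition.

```python
from itertools import product


def khintchine_constant(a):
    """E[|theta . a|] for theta uniform on {-1, +1}^n, by enumeration."""
    n = len(a)
    total = sum(abs(sum(t * ai for t, ai in zip(theta, a)))
                for theta in product((-1, 1), repeat=n))
    return total / 2 ** n
```

For $a = (1, 2)$ the four sign patterns give $|\theta \cdot a| \in \{3, 1, 1, 3\}$, so the constant is $2$.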

To reduce the membership problem for the Khintchine polytope to optimal Bayesian persuasion, the 
main challenges come from our restrictions on $\mathcal{K}(n)$, namely to schemes with two signals which are equally 
probable. Our reduction incorporates three key ideas. The first is to design a persuasion instance in which 
the optimal signaling scheme uses only two signals. The instance we define will have $n + 1$ actions. Action 0 
is special: it deterministically results in sender utility $\epsilon > 0$ (small enough) and receiver utility 0. The other 
$n$ actions are regular. Action $i > 0$ independently results in sender utility $-a_i$ and receiver utility $a_i$ with 
probability $\frac{1}{2}$ (call this type $1_i$), or sender utility $-b_i$ and receiver utility $b_i$ with probability $\frac{1}{2}$ (call this type 
$2_i$), for $a_i$ and $b_i$ to be set later. Note that the sender and receiver utilities are zero-sum for both types. Since 
the special action is deterministic and the probability of its (only) type is 1 in any signal, we can interpret 
any $(M^1, M^2) \in \mathcal{K}(n)$ as a two-signal signature for our persuasion instance (the row corresponding to the 
special action 0 is implied). We show that restricting to two-signal schemes is without loss of generality 
in this persuasion instance. The proof tracks the following intuition: due to the zero-sum nature of regular 
actions, any additional information regarding regular actions would benefit the receiver and harm the sender. 
Consequently, the sender does not reveal any information which distinguishes between different regular actions. 
Formally, we prove that there always exists an optimal signaling scheme with only two signals: one signal 
recommends the special action, and the other recommends some regular action. 

We denote the signal that recommends the special action 0 by $\sigma_+$ (indicating that the sender derives 
positive utility $\epsilon$), and denote the other signal by $\sigma_-$ (indicating that the sender derives negative utility, as 
we show). The second key idea concerns choosing appropriate values for $\{a_i, b_i\}_{i=1}^n$ for a given two-signal 
signature $(M^1, M^2)$ to be tested. We choose these values to satisfy the following two properties: (1) For 
all regular actions, the signaling scheme implementing $(M^1, M^2)$ (if it exists) results in the same sender 
utility $-1$ (thus receiver utility 1) conditioned on $\sigma_-$ and the same sender utility 0 conditioned on $\sigma_+$; (2) 
the maximum possible expected sender utility from $\sigma_-$, i.e., the sender utility conditioned on $\sigma_-$ multiplied 
by the probability of $\sigma_-$, is $-\frac{1}{2}$. As a result of Property (1), if $(M^1, M^2) \in \mathcal{K}(n)$ then the corresponding 
signaling scheme $\varphi$ is IC and results in expected sender utility $T = \frac{1}{2}\epsilon - \frac{1}{2}$ (since each signal is sent with 
probability $\frac{1}{2}$). Property (2) implies that $\varphi$ results in the maximum possible expected sender utility from $\sigma_-$. 

We now run into a challenge: the existence of a signaling scheme with expected sender utility $T = \frac{1}{2}\epsilon - \frac{1}{2}$ 
does not necessarily imply that $(M^1, M^2) \in \mathcal{K}(n)$ if $\epsilon$ is large. Our third key idea is to set $\epsilon > 0$ “sufficiently 
small” so that any optimal signaling scheme must result in the maximum possible expected sender utility $-\frac{1}{2}$ 
from signal $\sigma_-$ (see Property (2) above). In other words, we must make $\epsilon$ so small that the sender prefers 
not to sacrifice any of her payoff from $\sigma_-$ in order to gain utility from the special action recommended by 
$\sigma_+$. We show that such an $\epsilon$ exists with polynomially many bits. We prove its existence by arguing that 
the polytope of incentive-compatible two-signal signatures has polynomial bit complexity, and therefore an 
$\epsilon > 0$ that is smaller than the “bit complexity” of the vertices would suffice. 

As a result of this choice of $\epsilon$, if the optimal sender utility is precisely $T = \frac{1}{2}\epsilon - \frac{1}{2}$ then we know that 
signal $\sigma_+$ must be sent with probability $\frac{1}{2}$, since the expected sender utility from signal $\sigma_-$ must be $-\frac{1}{2}$. 
We show that this, together with the specifically constructed $\{a_i, b_i\}_{i=1}^n$, is sufficient to guarantee that 
the optimal signaling scheme must implement the given two-signal signature $(M^1, M^2)$, i.e., $(M^1, M^2) \in \mathcal{K}(n)$. 
When the optimal sender utility is strictly greater than $T$, the optimal signaling scheme does 
not implement $(M^1, M^2)$, but we show that it can be post-processed into one that does. 

5 The General Persuasion Problem 

We now turn our attention to the Bayesian Persuasion problem when the payoffs of different actions are 
arbitrarily correlated, and the joint distribution $\lambda$ is presented as a black-box sampling oracle. We assume 
that payoffs are normalized to lie in the bounded interval $[-1, 1]$, and prove essentially matching positive and 
negative results. Our positive result is a fully polynomial-time approximation scheme for optimal persuasion 
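Algorithm 2 Signaling Scheme for a Black Box Distribution 

Parameter: $\epsilon \ge 0$ 
Parameter: Integer $K > 0$ 
Input: Prior distribution $\lambda$ supported on $[-1, 1]^{2n}$, given by a sampling oracle 
Input: State of nature $\theta \in [-1, 1]^{2n}$ 
Output: Signal $\sigma \in \Sigma$, where $\Sigma = \{\sigma_1, \ldots, \sigma_n\}$. 

1: Draw integer $\ell$ uniformly at random from $\{1, \ldots, K\}$, and denote $\theta_\ell = \theta$. 
2: Sample $\theta_1, \ldots, \theta_{\ell-1}, \theta_{\ell+1}, \ldots, \theta_K$ independently from $\lambda$, and let the multiset $\tilde{\lambda} = \{\theta_1, \ldots, \theta_K\}$ denote 
the empirical distribution augmented with the input state $\theta = \theta_\ell$. 
3: Solve linear program (4) to obtain the signaling scheme $\tilde{\varphi} : \tilde{\lambda} \to \Delta(\Sigma)$. 
4: Output a sample from $\tilde{\varphi}(\theta) = \tilde{\varphi}(\theta_\ell)$. 


$$
\begin{array}{lll}
\text{maximize} & \sum_{k=1}^{K} \sum_{i=1}^{n} \frac{1}{K} \tilde{\varphi}(\theta_k, \sigma_i)\, s_i(\theta_k) & \\
\text{subject to} & \sum_{i=1}^{n} \tilde{\varphi}(\theta_k, \sigma_i) = 1, & \text{for } k \in [K]. \\
 & \sum_{k=1}^{K} \frac{1}{K} \tilde{\varphi}(\theta_k, \sigma_i)\, r_i(\theta_k) \ge \sum_{k=1}^{K} \frac{1}{K} \tilde{\varphi}(\theta_k, \sigma_i) \left(r_j(\theta_k) - \epsilon\right), & \text{for } i, j \in [n]. \qquad (4) \\
 & \tilde{\varphi}(\theta_k, \sigma_i) \ge 0, & \text{for } k \in [K],\ i \in [n].
\end{array}
$$

Relaxed Empirical Optimal Signaling Problem 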


with a bi-criteria guarantee; specifically, we achieve approximate optimality and approximate incentive 
compatibility in the additive sense described in Section 2. Our negative results show that such a bi-criteria 
loss is inevitable in the black box model for information-theoretic reasons. 

5.1 A Bicriteria FPTAS 


Theorem 5.1. Consider the Bayesian Persuasion problem in the black-box oracle model with $n$ actions and 
payoffs in $[-1, 1]$, and let $\epsilon > 0$ be a parameter. An $\epsilon$-optimal and $\epsilon$-incentive compatible signaling scheme 
can be implemented in poly(n, 1/ε) time. 


To prove Theorem 5.1, we show that a simple Monte-Carlo algorithm implements an approximately 
optimal and approximately incentive compatible scheme $\varphi$. Notably, our algorithm does not compute a 
representation of the entire signaling scheme $\varphi$ as in Section 3, but rather merely samples its output $\varphi(\theta)$ 
on a given input $\theta$. At a high level, when given as input a state of nature $\theta$, our algorithm first takes $K = $ 
poly(n, 1/ε) samples from the prior distribution $\lambda$ which, intuitively, serve to place the true state of nature $\theta$ 
in context. Then the algorithm uses a linear program to compute the optimal $\epsilon$-incentive compatible scheme 
$\tilde{\varphi}$ for the empirical distribution of samples augmented with the input $\theta$. Finally, the algorithm signals as 
suggested by $\tilde{\varphi}$ for $\theta$. Details are in Algorithm 2, which we instantiate with $\epsilon > 0$ and a suitable $K = $ poly(n, 1/ε). 
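
Steps 1 and 2 of Algorithm 2, which hide the true state among fresh samples, can be sketched in Python as follows. The function name and the `sample_prior` callable (standing in for the black-box sampling oracle) are hypothetical, and positions are 0-based rather than 1-based as in the algorithm.

```python
import random


def augmented_sample(theta, sample_prior, K, rng=random):
    """Place the input state at a uniformly random position among
    K - 1 independent draws from the prior (steps 1-2 of Algorithm 2)."""
    ell = rng.randrange(K)  # 0-based position of the true state
    states = [sample_prior() for _ in range(K - 1)]
    states.insert(ell, theta)
    return ell, states
```

When the input state is itself drawn from the prior, the returned list is distributed as $K$ i.i.d. draws and the position `ell` is independent of its contents, which is what the deferred-decisions argument relies on.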

We note that relaxing incentive compatibility is necessary for convergence to the optimal sender utility; 
we prove this formally in Section 5.2. This is why LP (4) features relaxed incentive compatibility 
constraints. Instantiating Algorithm 2 with $\epsilon = 0$ results in an exactly incentive compatible scheme which 
could be far from the optimal sender utility for any finite number of samples $K$, as reflected in Lemma 5.4. 
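
The role of the relaxation can be seen on a small instance by evaluating the objective and constraints of LP (4) for a candidate scheme. The Python sketch below does not solve the LP; it only evaluates a given scheme. It assumes the scheme is stored as `phi[k][i]` (probability of recommending action $i$ at sample $\theta_k$) and the payoffs as `s[k][i]` and `r[k][i]` for $s_i(\theta_k)$ and $r_i(\theta_k)$; these names and this layout are hypothetical.

```python
def empirical_value(phi, s):
    """Objective of LP (4): average sender payoff over the K samples."""
    K = len(phi)
    return sum(phi[k][i] * s[k][i]
               for k in range(K) for i in range(len(phi[k]))) / K


def is_feasible(phi, r, eps, tol=1e-9):
    """Check the constraints of LP (4) for a candidate scheme phi."""
    K, n = len(phi), len(phi[0])
    # Each sample's recommendations must form a probability distribution.
    for row in phi:
        if any(p < -tol for p in row) or abs(sum(row) - 1) > tol:
            return False
    # Relaxed incentive compatibility for every ordered pair of actions.
    for i in range(n):
        for j in range(n):
            lhs = sum(phi[k][i] * r[k][i] for k in range(K)) / K
            rhs = sum(phi[k][i] * (r[k][j] - eps) for k in range(K)) / K
            if lhs < rhs - tol:
                return False
    return True
```

For instance, a scheme that always recommends an action the receiver dislikes violates the constraints at $\epsilon = 0$ but becomes feasible once $\epsilon$ is as large as the receiver's payoff gap.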

Theorem 5.1 follows from three lemmas pertaining to the scheme $\varphi$ implemented by Algorithm 2. 
Approximate incentive compatibility for $\lambda$ (Lemma 5.2) follows from the principle of deferred decisions, 
linearity of expectations, and the fact that $\tilde{\varphi}$ is approximately incentive compatible for the augmented empirical 
distribution $\tilde{\lambda}$. A similar argument, also based on the principle of deferred decisions and linearity of 
expectations, shows that the expected sender utility from our scheme when $\theta \sim \lambda$ equals the expected optimal 
value of linear program (4), as stated in Lemma 5.3. Finally, we show in Lemma 5.4 that the optimal value 
of LP (4) is close to the optimal sender utility for $\lambda$ with high probability, and hence also in expectation, 
when $K = $ poly(n, 1/ε) is chosen appropriately; the proof of this fact invokes standard tail bounds as well 
as structural properties of linear program (4), and exploits the fact that LP (4) relaxes the incentive 
compatibility constraint. We prove all three lemmas in Appendix D.1. Even though our proof of Lemma 5.4 is 
self-contained, we note that it can be shown to follow from [45, Theorem 6] with some additional work. 


Lemma 5.2. Algorithm 2 implements an $\epsilon$-incentive compatible signaling scheme for prior distribution $\lambda$. 

Lemma 5.3. Assume $\theta \sim \lambda$, and assume the receiver follows the recommendations of Algorithm 2. The 
expected sender utility equals the expected optimal value of the linear program (4) solved in Step 3. Both 
expectations are taken over the random input $\theta$ as well as internal randomness and Monte-Carlo sampling 
performed by the algorithm. 


Lemma 5.4. Let OPT denote the expected sender utility induced by the optimal incentive compatible 
signaling scheme for distribution $\lambda$. When Algorithm 2 is instantiated with a sufficiently large $K = $ poly(n, 1/ε) and its input 
$\theta$ is drawn from $\lambda$, the expected optimal value of the linear program (4) solved in Step 3 is at least OPT $- \epsilon$. 
Expectation is over the random input $\theta$ as well as the Monte-Carlo sampling performed by the algorithm. 


5.2 Information-Theoretic Barriers 

We now show that our bi-criteria FPTAS is close to the best we can hope for: there is no bounded-sample 
signaling scheme in the black box model which guarantees incentive compatibility and c-optimality for any 
constant c < 1 , nor is there such an algorithm which guarantees optimality and c-incentive compatibility for 
any c < |. Formally, we consider algorithms which implement direct signaling schemes. Such an algorithm 
takes as input a black-box distribution $\lambda$ supported on $[-1, 1]^{2n}$ and a state of nature $\theta \in [-1, 1]^{2n}$, where $n$ 
is the number of actions, and outputs a signal $\sigma \in \{\sigma_1, \ldots, \sigma_n\}$ recommending an action. We say such an 
algorithm is $\epsilon$-incentive compatible [$\epsilon$-optimal] if for every distribution $\lambda$ the signaling scheme $\mathcal{A}(\lambda)$ is 
$\epsilon$-incentive compatible [$\epsilon$-optimal] for $\lambda$. We define the sample complexity $SC_{\mathcal{A}}(\lambda, \theta)$ as the expected number 
of queries made by $\mathcal{A}$ to the blackbox given inputs $\lambda$ and $\theta$, where expectation is taken over the randomness 
inherent in the Monte-Carlo sampling from $\lambda$ as well as any other internal coins of $\mathcal{A}$. We show that the 
worst-case sample complexity is not bounded by any function of $n$ and the approximation parameters unless 
we allow bi-criteria loss in both optimality and incentive compatibility. Moreover, we show a stronger negative 
result for exactly incentive compatible algorithms: the average sample complexity over $\theta \sim \lambda$ is also not 
bounded by a function of $n$ and the suboptimality parameter. Whereas our results imply that we should 
give up on exact incentive compatibility, we leave open the question of whether an optimal and $\epsilon$-incentive 
compatible algorithm exists with poly(n, 1/ε) average case (but unbounded worst-case) sample complexity. 

Theorem 5.5. The following hold for every algorithm $\mathcal{A}$ for Bayesian Persuasion in the black-box model: 

(a) If $\mathcal{A}$ is incentive compatible and $c$-optimal for $c < 1$, then for every integer $K$ there is a distribution 
$\lambda = \lambda(K)$ on 2 actions and 2 states of nature such that $\mathbb{E}_{\theta \sim \lambda}[SC_{\mathcal{A}}(\lambda, \theta)] > K$. 

(b) If A is optimal and c-incentive compatible for c < \, then for every integer K there is a distribution 
$\lambda = \lambda(K)$ on 3 actions and 3 states of nature, and $\theta$ in the support of $\lambda$, such that $SC_{\mathcal{A}}(\lambda, \theta) > K$. 

Our proof of each part of this theorem involves constructing a pair of distributions $\lambda$ and $\lambda'$ which are 
arbitrarily close in statistical distance, but with the property that any algorithm with the postulated guarantees 
must distinguish between $\lambda$ and $\lambda'$. We defer the proof to Appendix D.2. 
A Additional Discussion of Connections to Bayesian Mechanism Design 

Section 3, which considers persuasion with independent and identically-distributed actions, relates to two 
ideas from auction theory. First, our symmetrization result in Section 3.1 is similar to that of Daskalakis and 
Weinberg, but involves an additional ingredient which is necessary in our case: not only is the posterior 
type distribution for a recommended action (the winning bidder in the auction analogy) independent of the 
identity of the action, but so is the posterior type distribution of an unrecommended action (losing bidder). 
Second, our algorithm for computing the optimal scheme in Section 3.2 involves a connection to Border’s 
characterization of the space of feasible reduced-form single-item auctions [?], as well as its algorithmic 
properties [?]. However, unlike in the case of single-item auctions, this connection hinges crucially on 
the symmetries of the optimal scheme, and fails to generalize to the case of persuasion with independent 
non-identical actions (analogous to independent non-identical bidders) as we show in Section 4. We view 
this as evidence that persuasion and auction design, while bearing similarities and technical connections, 
are importantly different. 

Section 4 shows that our Border’s theorem-based approach in Section 3 cannot be extended to 
independent non-identical actions. Our starting point is the work of Gopalan et al., who rule out 
Border’s-theorem-like characterizations for a number of mechanism design settings by showing the 
#P-hardness of computing the maximum expected revenue or welfare. Our results similarly show that it is 
#P-hard to compute the maximum expected sender utility, but our reduction is much more involved. 
Specifically, whereas we also reduce from the #P-hard problem of computing the Khintchine constant of a vector, 
unlike in Gopalan et al. our reduction must go through the membership problem of a polytope which we use to 
encode the Khintchine constant computation. This detour seems unavoidable due to the different nature of 
the incentive-compatibility constraints placed on a signaling scheme (*). Specifically, we present an intricate 
reduction from membership testing in this “Khintchine polytope” to an optimal persuasion problem with 
independent actions. 

Our algorithmic result for the black box model in Section 5 draws inspiration from, and is technically 
related to, the work in [?] on algorithmically efficient mechanisms for multi-dimensional settings. 
Specifically, an alternative algorithm for our problem can be derived using the framework of reduced forms 
and virtual welfare of Cai et al. [16] with significant additional work (**). For this, a different reduced form 
is needed which allows for an unbounded “type space”, and maintains the correlation information across 
actions necessary for evaluating the persuasion notion of incentive compatibility, which is importantly 
different from incentive compatibility in mechanism design. Such a reduced form exists, and the resulting 
algorithm is complex and invokes the ellipsoid algorithm as a subroutine. The algorithm we present here is 
much simpler and more efficient both in terms of runtime and samples from the distribution $\lambda$ over states 
of nature, with the main computational step being a single explicit linear program which solves for the 
optimal signaling scheme on a sample $\tilde{\lambda}$ from $\lambda$. The analysis of our algorithm is also more straightforward. 
This is possible in our setting due to our different notion of incentive compatibility, which permits 
reducing incentive compatibility on $\lambda$ to incentive compatibility on the sample $\tilde{\lambda}$ using the principle of deferred 
decisions. 


(*) In Gopalan et al., Myerson’s characterization is used to show that optimal mechanism design in a public project setting directly encodes 
computation of the Khintchine constant. No analogous direct connection seems to hold here. 

(**) We thank an anonymous reviewer for pointing out this connection. 
B Omissions from Section 3 

B.1 Symmetry of the Optimal Scheme (Theorem 3.2) 

To prove Theorem 3.2, we need two closure properties of optimal signaling schemes, with respect to 
permutations and convex combinations. We use $\pi$ to denote a permutation of $[n]$, and let $\mathbb{S}_n$ denote the set of 
all such permutations. We define the permutation $\pi(\theta)$ of a state of nature $\theta \in [m]^n$ so that $(\pi(\theta))_j = \theta_{\pi^{-1}(j)}$, 
and similarly the permutation of a signal $\sigma_j$ so that $\pi(\sigma_j) = \sigma_{\pi(j)}$. Given a signature $\mathcal{M} = \{(M^{\sigma_j}, \sigma_j)\}_{j \in [n]}$, 
we define the permuted signature $\pi(\mathcal{M}) = \{(\pi M^{\sigma_j}, \pi(\sigma_j))\}_{j \in [n]}$, where $\pi M$ denotes applying permutation 
$\pi$ to the rows of a matrix $M$. 

Lemma B.1. Assume the action payoffs are i.i.d., and let $\pi \in \mathbb{S}_n$ be an arbitrary permutation. If $\mathcal{M}$ is 
the signature of a signaling scheme $\varphi$, then $\pi(\mathcal{M})$ is the signature of the scheme defined by $\varphi_\pi(\theta) = $ 
$\pi(\varphi(\pi^{-1}(\theta)))$. Moreover, if $\varphi$ is incentive compatible and optimal, then so is $\varphi_\pi$. 

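Proof. Let $\mathcal{M} = \{(M^{\sigma}, \sigma)\}_{\sigma \in \Sigma}$ be the signature of $\varphi$, as given in the statement of the lemma. We first 
show that $\pi(\mathcal{M}) = \{(\pi M^{\sigma}, \pi(\sigma))\}_{\sigma \in \Sigma}$ is realizable as the signature of the scheme $\varphi_\pi$. By definition, it 
suffices to show that $\sum_{\theta \in \Theta} \lambda(\theta) \varphi_\pi(\theta, \pi(\sigma)) M^{\theta} = \pi M^{\sigma}$ for an arbitrary signal $\pi(\sigma)$. 

$$
\begin{aligned}
\sum_{\theta \in \Theta} \lambda(\theta)\, \varphi_\pi(\theta, \pi(\sigma))\, M^{\theta}
&= \sum_{\theta \in \Theta} \lambda(\theta)\, \varphi(\pi^{-1}(\theta), \sigma)\, M^{\theta} && \text{(by definition of } \varphi_\pi\text{)} \\
&= \pi \sum_{\theta \in \Theta} \lambda(\theta)\, \varphi(\pi^{-1}(\theta), \sigma)\, (\pi^{-1} M^{\theta}) && \text{(by linearity of permutation)} \\
&= \pi \sum_{\theta \in \Theta} \lambda(\theta)\, \varphi(\pi^{-1}(\theta), \sigma)\, M^{\pi^{-1}(\theta)} \\
&= \pi \sum_{\theta \in \Theta} \lambda(\pi^{-1}(\theta))\, \varphi(\pi^{-1}(\theta), \sigma)\, M^{\pi^{-1}(\theta)} && \text{(since } \lambda \text{ is i.i.d.)} \\
&= \pi \sum_{\theta' \in \Theta} \lambda(\theta')\, \varphi(\theta', \sigma)\, M^{\theta'} && \text{(by renaming } \pi^{-1}(\theta) \text{ to } \theta'\text{)} \\
&= \pi M^{\sigma} && \text{(by definition of } M^{\sigma}\text{)}
\end{aligned}
$$

Now, assuming $\varphi$ is incentive compatible, we check that $\varphi_\pi$ is incentive compatible by verifying the 
relevant inequality for its signature. 

$$
\rho \cdot (\pi M^{\sigma_i})_{\pi(i)} - \rho \cdot (\pi M^{\sigma_i})_{\pi(j)} = \rho \cdot M^{\sigma_i}_i - \rho \cdot M^{\sigma_i}_j \ge 0
$$

Moreover, we show that the sender’s utility is the same for $\varphi$ and $\varphi_\pi$, completing the proof. 

$$
\sum_{i=1}^{n} \xi \cdot (\pi M^{\sigma_i})_{\pi(i)} = \sum_{i=1}^{n} \xi \cdot M^{\sigma_i}_i
$$

□ 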

Lemma B.2. Let $t \in [0,1]$. If $\mathcal{A} = (A^{\sigma_1}, \ldots, A^{\sigma_n})$ is the signature of scheme $\varphi_A$, and $\mathcal{B} = (B^{\sigma_1}, \ldots, B^{\sigma_n})$ 
is the signature of a scheme $\varphi_B$, then their convex combination $\mathcal{C} = (C^{\sigma_1}, \ldots, C^{\sigma_n})$ with $C^{\sigma_i} = tA^{\sigma_i} + $ 
$(1 - t)B^{\sigma_i}$ is the signature of the scheme $\varphi_C$ which, on input $\theta$, outputs $\varphi_A(\theta)$ with probability $t$ and $\varphi_B(\theta)$ 
with probability $1 - t$. Moreover, if $\varphi_A$ and $\varphi_B$ are both optimal and incentive compatible, then so is $\varphi_C$. 

Proof. This follows almost immediately from the fact that the optimization problem in Figure 2 is a linear 
program, with a convex feasible set and a convex family of optimal solutions. We omit the straightforward 
details. □ 
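
The linearity behind Lemma B.2 can be checked numerically on a tiny instance. The Python sketch below assumes the signature entry $M^{\sigma}_{kj}$ is the joint probability that the scheme sends $\sigma$ and action $k$ has type $j$, under an i.i.d. prior with marginal $q$. The representation of a scheme as a function from a type profile to a dictionary of signal probabilities, and all names, are hypothetical; types and signals are indexed from 0.

```python
from fractions import Fraction
from itertools import product


def signature(scheme, q, n):
    """sig[s][k][j] = Pr[signal s is sent and action k has type j]."""
    m = len(q)
    sig = [[[Fraction(0)] * m for _ in range(n)] for _ in range(n)]
    for theta in product(range(m), repeat=n):
        prior = Fraction(1)
        for t in theta:
            prior *= q[t]  # i.i.d. prior over type profiles
        for s, p in scheme(theta).items():
            for k, t in enumerate(theta):
                sig[s][k][t] += prior * p
    return sig


def mix(scheme_a, scheme_b, t):
    """Scheme that follows scheme_a with probability t, else scheme_b."""
    def scheme_c(theta):
        out = {}
        for s, p in scheme_a(theta).items():
            out[s] = out.get(s, 0) + t * p
        for s, p in scheme_b(theta).items():
            out[s] = out.get(s, 0) + (1 - t) * p
        return out
    return scheme_c
```

Using exact rational arithmetic, the signature of `mix(a, b, t)` agrees entry by entry with `t` times the signature of `a` plus `1 - t` times the signature of `b`.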
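Proof of Theorem 3.2 

Given an optimal and incentive compatible signaling scheme $\varphi$ with signature $\{(M^{\sigma_i}, \sigma_i)\}_{i \in [n]}$, we show the 
existence of a symmetric optimal and incentive-compatible scheme of the form in Definition 3.1. According 
to Lemma B.1, for $\pi \in \mathbb{S}_n$ the signature $\{(\pi M^{\sigma_i}, \pi(\sigma_i))\}_{i \in [n]}$ (equivalently written as $\{(\pi M^{\sigma_{\pi^{-1}(i)}}, \sigma_i)\}_{i \in [n]}$) 
corresponds to the optimal incentive compatible scheme $\varphi_\pi$. Invoking Lemma B.2, the signature 

$$
\{(A^{\sigma_i}, \sigma_i)\}_{i \in [n]} = \left\{ \left( \frac{1}{n!} \sum_{\pi \in \mathbb{S}_n} \pi M^{\sigma_{\pi^{-1}(i)}},\ \sigma_i \right) \right\}_{i \in [n]}
$$

also corresponds to an optimal and incentive compatible scheme, namely the scheme which draws a 
permutation $\pi$ uniformly at random, then signals according to $\varphi_\pi$. 

Observe that the $i$th row of the matrix $\pi M^{\sigma_{\pi^{-1}(i)}}$ is the $\pi^{-1}(i)$th row of the matrix $M^{\sigma_{\pi^{-1}(i)}}$. Expressing 
$A^{\sigma_i}_i$ as a sum over permutations $\pi \in \mathbb{S}_n$, and grouping the sum by $k = \pi^{-1}(i)$, we can write 

$$
\begin{aligned}
A^{\sigma_i}_i &= \frac{1}{n!} \sum_{\pi \in \mathbb{S}_n} \left[ \pi M^{\sigma_{\pi^{-1}(i)}} \right]_i \\
&= \frac{1}{n!} \sum_{\pi \in \mathbb{S}_n} M^{\sigma_{\pi^{-1}(i)}}_{\pi^{-1}(i)} \\
&= \frac{1}{n!} \sum_{k=1}^{n} M^{\sigma_k}_k \cdot \left| \{ \pi \in \mathbb{S}_n : \pi^{-1}(i) = k \} \right| \\
&= \frac{1}{n!} \sum_{k=1}^{n} M^{\sigma_k}_k \cdot (n-1)! \\
&= \frac{1}{n} \sum_{k=1}^{n} M^{\sigma_k}_k
\end{aligned}
$$

which does not depend on $i$. Similarly, the $j$th row of the matrix $\pi M^{\sigma_{\pi^{-1}(i)}}$ is the $\pi^{-1}(j)$th row of the 
matrix $M^{\sigma_{\pi^{-1}(i)}}$. For $j \ne i$, expressing $A^{\sigma_i}_j$ as a sum over permutations $\pi \in \mathbb{S}_n$, and grouping the sum by 
$k = \pi^{-1}(i)$ and $l = \pi^{-1}(j)$, we can write 

$$
\begin{aligned}
A^{\sigma_i}_j &= \frac{1}{n!} \sum_{\pi \in \mathbb{S}_n} \left[ \pi M^{\sigma_{\pi^{-1}(i)}} \right]_j \\
&= \frac{1}{n!} \sum_{\pi \in \mathbb{S}_n} M^{\sigma_{\pi^{-1}(i)}}_{\pi^{-1}(j)} \\
&= \frac{1}{n!} \sum_{k \ne l} M^{\sigma_k}_l \cdot \left| \{ \pi \in \mathbb{S}_n : \pi^{-1}(i) = k,\ \pi^{-1}(j) = l \} \right| \\
&= \frac{1}{n!} \sum_{k \ne l} M^{\sigma_k}_l \cdot (n-2)! \\
&= \frac{1}{n(n-1)} \sum_{k \ne l} M^{\sigma_k}_l
\end{aligned}
$$

which does not depend on $i$ or $j$. Let 

$$
x = \frac{1}{n} \sum_{k=1}^{n} M^{\sigma_k}_k, \qquad y = \frac{1}{n(n-1)} \sum_{k \ne l} M^{\sigma_k}_l .
$$

The signature $\{(A^{\sigma_i}, \sigma_i)\}_{i \in [n]}$ therefore describes an optimal, incentive compatible, and symmetric scheme 
with s-signature $(x, y)$. 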

B.2 The Optimal Scheme 

Proof of Lemma l33] 

For the “only if’ direction, ||a;||i = - and x + {n — l)y = q were established in Section [3Tl To show that 
r is a realizable symmetric reduced form for an allocation rule, let (/? be a signaling scheme with s-signature 
{x,y). Recall from the definition of an s-signature that, for each i € [n], signal ctj has probability 1/re, 
and nx is the posterior distribution of action fs type conditioned on signal ctj. Now consider the following 
allocation rule: Given a type profile 6 G [m\^ of fhe re bidders, allocate the item to bidder i with probability 
^p{9, ai) for any z G [re]. By Bayes rule. 


Pr[z gets item I z has type j] = Pr[z has type j\i gets item] • 


Pr[z gets item] 
Pr[z has type j] 


1/re 

= nxj ■ - 


(ij 


Therefore r is indeed the reduced form of the described allocation rule. 

For the “if” direction, let $\tau$, $x$, and $y$ be as in the statement of the lemma, and consider an allocation rule $A$ with symmetric reduced form $\tau$. Observe that $A$ always allocates the item, since for each player $i \in [n]$ we have $\Pr[i \text{ gets the item}] = \sum_{j=1}^{m} q_j \tau_j = \|x\|_1 = \frac{1}{n}$. We define the direct signaling scheme $\varphi_A$ by $\varphi_A(\theta) = \sigma_{A(\theta)}$. Let $\mathcal{M} = (M^{\sigma_1}, \ldots, M^{\sigma_n})$ be the signature of $\varphi_A$. Recall that, for $\theta \sim \lambda$ and arbitrary $i \in [n]$ and $j \in [m]$, $M^{\sigma_i}_{ij}$ is the probability that $\varphi_A(\theta) = \sigma_i$ and $\theta_i = j$; by definition, this equals the probability that $A$ allocates the item to player $i$ and her type is $j$, which is $\tau_j q_j = x_j$. As a result, the signature $\mathcal{M}$ of $\varphi_A$ satisfies $M^{\sigma_i}_i = x$ for every action $i$. If $\varphi_A$ were symmetric, we would conclude that its s-signature is $(x, y)$, since every s-signature $(x, y')$ must satisfy $x + (n-1)y' = q$ (see Section 3.1). However, this is not guaranteed when the allocation rule $A$ exhibits some asymmetry. Nevertheless, $\varphi_A$ can be “symmetrized” into a signaling scheme $\varphi'_A$ which first draws a random permutation $\pi \in S_n$, and signals $\pi(\varphi_A(\pi^{-1}(\theta)))$. That $\varphi'_A$ has s-signature $(x, y)$ follows by an argument similar to that used in the proof of Theorem 3.2, and we therefore omit the details here.

Finally, observe that the description of $\varphi'_A$ above is constructive assuming black-box access to $A$, with runtime overhead that is polynomial in $n$ and $m$.
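The symmetrization step can be illustrated with a small brute-force sketch. It is not the paper's implementation: the function name `symmetrized_signature`, the restriction to deterministic direct schemes, and the uniform i.i.d. prior over $m$ types are all assumptions made for illustration. The code averages over every permutation instead of sampling one, so the resulting joint probabilities are exact.

```python
from fractions import Fraction
from itertools import permutations, product

def symmetrized_signature(scheme, n, m):
    """Exact signature of the symmetrized scheme pi(scheme(pi^{-1}(theta))).

    scheme(theta) returns the index of the recommended action (hypothetical
    interface). Assumes a uniform i.i.d. prior over m types for n actions.
    Returns sig with sig[i][j] = Pr[action i is recommended and theta_i = j].
    """
    sig = [[Fraction(0)] * m for _ in range(n)]
    perms = list(permutations(range(n)))
    weight = Fraction(1, len(perms) * m ** n)   # Pr[theta] * Pr[pi]
    for theta in product(range(m), repeat=n):
        for pi in perms:
            # Position k of the permuted profile holds the type of action pi[k].
            permuted = tuple(theta[pi[k]] for k in range(n))
            k = scheme(permuted)
            i = pi[k]                           # map the recommendation back
            sig[i][theta[i]] += weight
    return sig

# An asymmetric scheme: always recommend the first action.
always_first = lambda theta: 0
for row in symmetrized_signature(always_first, 3, 2):
    print(row)
```

After symmetrization every row `sig[i]` is the same vector, which plays the role of $x$ in the s-signature, and its entries sum to $1/n$.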

Proof of Lemma l3^ 

By Lemma 3.3, we can rewrite the sender's linear program as follows:

$$\begin{array}{ll}
\text{maximize} & n \xi \cdot x \\
\text{subject to} & \rho \cdot x \ge \rho \cdot y \\
& x + (n-1) y = q \\
& \|x\|_1 = \frac{1}{n} \\
& \left(\frac{x_1}{q_1}, \ldots, \frac{x_m}{q_m}\right) \text{ is a realizable symmetric reduced form}
\end{array} \tag{5}$$

From the literature on Border's theorem and its algorithmic properties, we know that the family of all realizable symmetric reduced forms constitutes a polytope, and moreover that this polytope admits an efficient separation oracle. The runtime of this oracle is polynomial in $m$ and $n$, and as a result the above linear program can be solved in $\mathrm{poly}(n, m)$ time using the Ellipsoid method.



B.3 A Simple (1 - 1/e)-approximate Scheme

Proof of Theorem 3.8

Given a binary signal $\sigma = (o_1, \ldots, o_n) \in \{\text{HIGH}, \text{LOW}\}^n$, the posterior type distribution for an action equals $nx^*$ if the corresponding component signal is HIGH, and equals $ny^*$ if the component signal is LOW. This is simply a consequence of the independence of the action types, the fact that the different component signals are chosen independently, and Bayes' rule. The constraint $\rho \cdot x^* \ge \rho \cdot y^*$ implies that the receiver prefers actions $i$ for which $o_i = \text{HIGH}$, any one of which induces an expected utility of $n\rho \cdot x^*$ for the receiver and $n\xi \cdot x^*$ for the sender. The latter quantity matches the optimal value of LP (3). The constraint $\|x^*\|_1 = \frac{1}{n}$ implies that each component signal is HIGH with probability $\frac{1}{n}$, independently. Therefore, the probability that at least one component signal is HIGH equals $1 - (1 - \frac{1}{n})^n \ge 1 - \frac{1}{e}$. Since payoffs are nonnegative, and since a rational receiver selects a HIGH action when one is available, the sender's overall expected utility is at least a $1 - \frac{1}{e}$ fraction of the optimal value of LP (3).
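The probability bound used in this proof is easy to check numerically. The sketch below is only a sanity check, and the helper name `prob_some_high` is ours rather than the paper's.

```python
import math

def prob_some_high(n: int) -> float:
    """Probability that at least one of n independent component signals is
    HIGH when each one is HIGH with probability 1/n."""
    return 1.0 - (1.0 - 1.0 / n) ** n

bound = 1.0 - 1.0 / math.e
for n in (1, 2, 5, 10, 100, 10_000):
    # The probability decreases toward 1 - 1/e as n grows, never below it.
    print(n, round(prob_some_high(n), 6), prob_some_high(n) >= bound)
```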
C Proof of Theorem 4.1


This section is devoted to proving Theorem 4.1. Our proof starts from the ideas of Gopalan et al. [30], who show #P-hardness for revenue or welfare maximization in several mechanism design problems. In one case, [30] reduce from the #P-hard problem of computing the Khintchine constant of a vector. Our reduction also starts from this problem, but is much more involved. First, we exhibit a polytope which we term the Khintchine polytope, and show that computing the Khintchine constant reduces to linear optimization over the Khintchine polytope. Second, we present a reduction from the membership problem for the Khintchine polytope to the computation of optimal sender utility in a particularly-crafted instance of persuasion with independent actions. Invoking the polynomial-time equivalence between membership checking and optimization, we conclude the #P-hardness of our problem. The main technical challenge we overcome is in the second step of our proof: given a point $x$ which may or may not be in the Khintchine polytope $\mathcal{K}$, we construct a persuasion instance and a threshold $T$ so that points in $\mathcal{K}$ encode signaling schemes, and the optimal sender utility is at least $T$ if and only if $x \in \mathcal{K}$ and the scheme corresponding to $x$ results in sender utility $T$.

The Khintchine Polytope 


We start by defining the Khintchine problem, which is shown to be #P-hard in [30].


Definition C.1. (Khintchine Problem) Given a vector $a \in \mathbb{R}^n$, compute the Khintchine constant $K(a)$ of $a$, defined as follows:

$$K(a) = \mathbf{E}_{\theta}\left[\,|\theta \cdot a|\,\right],$$

where $\theta$ is drawn uniformly at random from $\{\pm 1\}^n$.
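To make the definition concrete, here is a minimal brute-force sketch that evaluates $K(a)$ exactly by enumerating all $2^n$ sign vectors. The function name is ours, and the exponential running time is in keeping with the #P-hardness of the problem; this is an illustration, not an efficient algorithm.

```python
from fractions import Fraction
from itertools import product

def khintchine_constant(a):
    """Exact K(a) = E|theta . a| for theta uniform on {-1, +1}^n,
    computed by enumerating all 2^n sign vectors."""
    n = len(a)
    total = sum(
        abs(sum(t * Fraction(x) for t, x in zip(theta, a)))
        for theta in product((-1, 1), repeat=n)
    )
    return total / 2 ** n

print(khintchine_constant([3, 4]))     # sign sums are 7, 1, -1, -7
print(khintchine_constant([1, 1, 1]))
```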


To relate the Khintchine problem to Bayesian persuasion, we begin with a persuasion instance with $n$ i.i.d. actions. Moreover, there are only two action types, which we refer to as type $-1$ and type $+1$. The state of nature is a uniform random draw from the set $\{\pm 1\}^n$, with the $i$th entry specifying the type of action $i$. It is easy to see that these actions are i.i.d., with marginal probability $\frac{1}{2}$ for each type. We call this instance the Khintchine-like persuasion setting. As in Section 3, we still use the signature to capture the payoff-relevant features of a signaling scheme. A signature for the Khintchine-like persuasion problem is of the form $\mathcal{M} = (M^1, \ldots, M^n)$ where $M^i \in \mathbb{R}^{n \times 2}$ for any $i \in [n]$. We pay special attention to signaling schemes which use only two signals, in which case we represent them using a two-signal signature of the form $(M^1, M^2) \in \mathbb{R}^{n \times 2} \times \mathbb{R}^{n \times 2}$. Recall that such a signature is realizable if there is a signaling scheme which uses only two signals, with the property that $M^i_{jt}$ is the joint probability of the $i$th signal and the event that action $j$ has type $t$. We now define the Khintchine polytope, consisting of a convex family of two-signal signatures.

Definition C.2. The Khintchine polytope is the family $\mathcal{K}(n)$ of realizable two-signal signatures $(M^1, M^2)$ for the Khintchine-like persuasion setting which satisfy the additional constraints $M^1_{i,1} + M^1_{i,2} = \frac{1}{2}$ for all $i \in [n]$.

We sometimes use $\mathcal{K}$ to denote the Khintchine polytope $\mathcal{K}(n)$ when the dimension $n$ is clear from the context. Note that the constraints $M^1_{i,1} + M^1_{i,2} = \frac{1}{2}$, for all $i \in [n]$, state that the first signal should be sent with probability $\frac{1}{2}$ (hence also the second signal). We now show that optimizing over the Khintchine polytope is #P-hard by reducing the Khintchine problem to linear program (6).


Lemma C.3. General linear optimization over the Khintchine polytope $\mathcal{K}$ is #P-hard.

Footnote (on action types): Recall that each type is associated with a pair $(\xi, \rho)$, where $\xi$ [$\rho$] is the payoff to the sender [receiver] if the receiver takes an action of that type.


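$$\begin{array}{ll}
\text{maximize} & \sum_{i=1}^{n} a_i \left(M^+_{i,+1} - M^+_{i,-1}\right) - \sum_{i=1}^{n} a_i \left(M^-_{i,+1} - M^-_{i,-1}\right) \\
\text{subject to} & (M^+, M^-) \in \mathcal{K}(n)
\end{array} \tag{6}$$

Linear program (6) for computing the Khintchine constant $K(a)$ for $a \in \mathbb{R}^n$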


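Proof. For any given $a \in \mathbb{R}^n$, we reduce the computation of $K(a)$, the Khintchine constant for $a$, to a linear optimization problem over the Khintchine polytope $\mathcal{K}$. Since our reduction will use two signals $\sigma_+$ and $\sigma_-$ which correspond to the sign of $\theta \cdot a$, we will use $(M^+, M^-)$ to denote the two matrices in the signature in lieu of $(M^1, M^2)$. Moreover, we use the two action types $+1$ and $-1$ to index the columns of each matrix. For example, $M^+_{i,-1}$ is the joint probability of signal $\sigma_+$ and the event that the $i$th action has type $-1$.

We claim that the Khintchine constant $K(a)$ equals the optimal objective value of the implicitly-described linear program (6). We denote this optimal objective value by $OPT(\text{LP (6)})$. We first prove that $K(a) \le OPT(\text{LP (6)})$. Consider a signaling scheme $\varphi$ in the Khintchine-like persuasion setting which simply outputs $\sigma_{\mathrm{sign}(\theta \cdot a)}$ for each state of nature $\theta \in \{\pm 1\}^n$ (breaking ties uniformly at random if $\theta \cdot a = 0$). Since $\theta$ is drawn uniformly from $\{\pm 1\}^n$ and $\mathrm{sign}(\theta \cdot a) = -\mathrm{sign}(-\theta \cdot a)$, this scheme outputs each of the signals $\sigma_-$ and $\sigma_+$ with probability $\frac{1}{2}$. Consequently, the two-signal signature of $\varphi$ is a point in $\mathcal{K}$. Moreover, evaluating the objective function of LP (6) on the two-signal signature $(M^+, M^-)$ of $\varphi$ yields $K(a) = \mathbf{E}_{\theta}[|\theta \cdot a|]$, as shown below.

$$\begin{aligned}
\mathbf{E}_{\theta}[|\theta \cdot a|] &= \mathbf{E}_{\theta}[\theta \cdot a \mid \sigma_+] \cdot \Pr(\sigma_+) + \mathbf{E}_{\theta}[-\theta \cdot a \mid \sigma_-] \cdot \Pr(\sigma_-) \\
&= \sum_{i=1}^{n} a_i \mathbf{E}[\theta_i \mid \sigma_+] \cdot \Pr(\sigma_+) - \sum_{i=1}^{n} a_i \mathbf{E}[\theta_i \mid \sigma_-] \cdot \Pr(\sigma_-) \\
&= \sum_{i=1}^{n} a_i \left[\Pr(\theta_i = 1 \mid \sigma_+) - \Pr(\theta_i = -1 \mid \sigma_+)\right] \cdot \Pr(\sigma_+) \\
&\quad - \sum_{i=1}^{n} a_i \left[\Pr(\theta_i = 1 \mid \sigma_-) - \Pr(\theta_i = -1 \mid \sigma_-)\right] \cdot \Pr(\sigma_-) \\
&= \sum_{i=1}^{n} a_i \left[\Pr(\theta_i = 1, \sigma_+) - \Pr(\theta_i = -1, \sigma_+)\right] - \sum_{i=1}^{n} a_i \left[\Pr(\theta_i = 1, \sigma_-) - \Pr(\theta_i = -1, \sigma_-)\right] \\
&= \sum_{i=1}^{n} a_i \left(M^+_{i,+1} - M^+_{i,-1}\right) - \sum_{i=1}^{n} a_i \left(M^-_{i,+1} - M^-_{i,-1}\right).
\end{aligned}$$

This concludes the proof that $K(a) \le OPT(\text{LP (6)})$.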
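The first direction of the proof can be checked by brute force on small instances. The following sketch (function names are ours, and enumeration over all $2^n$ states is an illustrative shortcut) builds the two-signal signature of the scheme that signals the sign of $\theta \cdot a$, then evaluates the LP objective on it.

```python
from fractions import Fraction
from itertools import product

def sign_scheme_signature(a):
    """Two-signal signature (M_plus, M_minus) of the scheme signalling
    sign(theta . a). M[i][t] is the joint probability of the signal and the
    event theta_i = t, for t in {+1, -1}."""
    n = len(a)
    w = Fraction(1, 2 ** n)                      # probability of each state
    M_plus = [{+1: Fraction(0), -1: Fraction(0)} for _ in range(n)]
    M_minus = [{+1: Fraction(0), -1: Fraction(0)} for _ in range(n)]
    for theta in product((-1, 1), repeat=n):
        dot = sum(t * x for t, x in zip(theta, a))
        # Ties (dot == 0) are split evenly between the two signals.
        p_plus = Fraction(1) if dot > 0 else Fraction(0) if dot < 0 else Fraction(1, 2)
        for i, t in enumerate(theta):
            M_plus[i][t] += w * p_plus
            M_minus[i][t] += w * (1 - p_plus)
    return M_plus, M_minus

def lp_objective(a, M_plus, M_minus):
    """Objective of the Khintchine linear program at (M_plus, M_minus)."""
    plus = sum(x * (M_plus[i][1] - M_plus[i][-1]) for i, x in enumerate(a))
    minus = sum(x * (M_minus[i][1] - M_minus[i][-1]) for i, x in enumerate(a))
    return plus - minus

Mp, Mm = sign_scheme_signature([3, 4])
print(lp_objective([3, 4], Mp, Mm))   # equals K((3, 4))
```

Each row of `M_plus` sums to $\frac{1}{2}$, which is the extra constraint defining the Khintchine polytope.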

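Now we prove $K(a) \ge OPT(\text{LP (6)})$. Take any signaling scheme which uses only two signals $\sigma_+$ and $\sigma_-$, and let $(M^+, M^-)$ be its two-signal signature. Notice, however, that $\sigma_+$ now is only the “name” of the signal, and does not imply that $\theta \cdot a$ is positive. Nevertheless, it is still valid to reverse the above derivation until we reach

$$\sum_{i=1}^{n} a_i \left(M^+_{i,+1} - M^+_{i,-1}\right) - \sum_{i=1}^{n} a_i \left(M^-_{i,+1} - M^-_{i,-1}\right) = \mathbf{E}_{\theta}[\theta \cdot a \mid \sigma_+] \cdot \Pr(\sigma_+) + \mathbf{E}_{\theta}[-\theta \cdot a \mid \sigma_-] \cdot \Pr(\sigma_-).$$

Since $\theta \cdot a$ and $-\theta \cdot a$ are each no greater than $|\theta \cdot a|$, we have

$$\begin{aligned}
\mathbf{E}_{\theta}[\theta \cdot a \mid \sigma_+] \cdot \Pr(\sigma_+) + \mathbf{E}_{\theta}[-\theta \cdot a \mid \sigma_-] \cdot \Pr(\sigma_-) &\le \mathbf{E}_{\theta}[|\theta \cdot a| \mid \sigma_+] \cdot \Pr(\sigma_+) + \mathbf{E}_{\theta}[|\theta \cdot a| \mid \sigma_-] \cdot \Pr(\sigma_-) \\
&= \mathbf{E}_{\theta}[|\theta \cdot a|] = K(a).
\end{aligned}$$

That is, the objective value of LP (6) is upper bounded by $K(a)$, as needed.

□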


Before we proceed to present the reduction from the membership problem for $\mathcal{K}$ to optimal persuasion, we point out an interesting corollary of Lemma C.3.

Corollary C.4. Let $\mathcal{P}$ be the polytope of realizable signatures for a persuasion problem with $n$ i.i.d. actions and $m$ types (see Section 3). Linear optimization over $\mathcal{P}$ is #P-hard, and this holds even when $m = 2$.

Proof. Consider the Khintchine-like persuasion setting. It is easy to see that the Khintchine polytope $\mathcal{K}$ can be obtained from $\mathcal{P}$ by adding the constraints $M^i = 0$ for $i \ge 3$ and $M^1_{i,1} + M^1_{i,2} = \frac{1}{2}$ for $i \in [n]$, followed by a simple projection. Therefore, the membership problem for $\mathcal{K}$ can be reduced in polynomial time to the membership problem for $\mathcal{P}$, since the additional linear constraints can be explicitly checked in polynomial time. By the polynomial-time equivalence between optimization and membership, it follows that general linear optimization over $\mathcal{P}$ is #P-hard. □


Remark C.5. It is interesting to compare Corollary C.4 to single item auctions with i.i.d. bidders, where the problem does admit a polynomial-time separation oracle for the polytope of realizable signatures via Border's Theorem and its algorithmic properties. In contrast, the polytope of realizable signatures for Bayesian persuasion is #P-hard to optimize over. Nevertheless, in Section 3 we were indeed able to compute the optimal signaling scheme and sender utility for persuasion with i.i.d. actions. Corollary C.4 conveys that it was crucial for our algorithm to exploit the special structure of the persuasion objective and the symmetry of the optimal scheme, since optimizing a general objective over $\mathcal{P}$ is #P-hard.

Reduction 


We now present a reduction from the membership problem for the Khintchine polytope to the computation of optimal sender utility for persuasion with independent actions. As the output of our reduction, we construct a persuasion instance of the following form. There are $n + 1$ actions. Action 0 is special: it deterministically results in sender utility $\epsilon$ and receiver utility 0. Here, we think of $\epsilon > 0$ as being small enough for our arguments to go through. The other $n$ actions are regular. Action $i > 0$ independently results in sender utility $-a_i$ and receiver utility $a_i$ with probability $\frac{1}{2}$ (call this the type $1_i$), or sender utility $-b_i$ and receiver utility $b_i$ with probability $\frac{1}{2}$ (call this the type $2_i$). Note that the sender and receiver utilities are zero-sum for both types. Notice that, though each regular action's type distribution is uniform over its two types, the actions here are not identical because the associated payoffs, specified by $a_i$ and $b_i$ for each action $i$, are different for different actions. Since the special action is deterministic and the probability of its (only) type is 1 in any signal, we can interpret any $(M^1, M^2) \in \mathcal{K}(n)$ as a two-signal signature for our persuasion instance (the row corresponding to the special action 0 is implied). For example, $M^1_{i,2}$ is the joint probability of the first signal and the event that action $i$ has type $2_i$. Our goal is to reduce membership checking for $\mathcal{K}(n)$ to computing the optimal expected sender utility for a persuasion instance with carefully chosen parameters $\{a_i\}_{i=1}^{n}$, $\{b_i\}_{i=1}^{n}$, and $\epsilon$.

In relating optimal persuasion to the Khintchine polytope, there are two main difficulties: (1) $\mathcal{K}$ consists of two-signal signatures, so there should be an optimal scheme to our persuasion instance which uses only two signals; (2) to be consistent with the definition of $\mathcal{K}$, such an optimal scheme should send each signal with probability exactly $\frac{1}{2}$. We will design specific $\epsilon$, $a_i$, $b_i$ to accomplish both goals.

For notational convenience, we will again use $(M^+, M^-)$ to denote a typical element in $\mathcal{K}$ instead of $(M^1, M^2)$ because, as we will see later, the two constructed signals will induce positive and negative sender utilities, respectively. Notice that there are only $n$ degrees of freedom in $(M^+, M^-) \in \mathcal{K}$. This is because $M^+ + M^-$ is the all-$\frac{1}{2}$ matrix in $\mathbb{R}^{n \times 2}$ corresponding to the prior distribution of states of nature (by the definition of realizable signatures). Moreover, $M^+_{i,1} + M^+_{i,2} = \frac{1}{2}$ for all $i \in [n]$ (by the definition of $\mathcal{K}$). Therefore, we must have

$$M^+_{i,1} = M^-_{i,2} = \frac{1}{2} - M^+_{i,2}.$$

This implies that we can parametrize signatures $(M^+, M^-) \in \mathcal{K}$ by a vector $x \in [0, \frac{1}{2}]^n$, where $M^+_{i,1} = M^-_{i,2} = x_i$ and $M^+_{i,2} = M^-_{i,1} = \frac{1}{2} - x_i$ for each $i \in [n]$. For any $x \in [0, \frac{1}{2}]^n$, let $\mathcal{M}(x)$ denote the signature $(M^+, M^-)$ defined by $x$ as just described.

We can now restate the membership problem for $\mathcal{K}$ as follows: given $x \in [0, \frac{1}{2}]^n$, determine whether $\mathcal{M}(x) \in \mathcal{K}$. When any of the entries of $x$ equals 0 or $\frac{1}{2}$ this problem is trivial, so we assume without loss of generality that $x \in (0, \frac{1}{2})^n$. Moreover, when $x_i = \frac{1}{4}$ for some $i$, it is easy to see that a signaling scheme with signature $\mathcal{M}(x)$, if one exists, must choose its signal independently of the type of action $i$, and therefore $\mathcal{M}(x) \in \mathcal{K}(n)$ if and only if $\mathcal{M}(x_{-i}) \in \mathcal{K}(n-1)$. This allows us to assume without loss of generality that $x_i \neq \frac{1}{4}$ for all $i$.

Given $x \in (0, \frac{1}{2})^n$ with $x_i \neq \frac{1}{4}$ for all $i$, we construct specific $\epsilon$ and $a_i, b_i$ for all $i$ such that we can determine whether $\mathcal{M}(x) \in \mathcal{K}$ by simply looking at the optimal sender utility in the corresponding persuasion instance. We choose parameters $a_i$ and $b_i$ to satisfy the following two equations.

$$x_i a_i + \left(\tfrac{1}{2} - x_i\right) b_i = 0 \tag{7}$$

$$\left(\tfrac{1}{2} - x_i\right) a_i + x_i b_i = \tfrac{1}{2} \tag{8}$$

We note that the above linear system always has a solution when $x_i \neq \frac{1}{4}$, which we assumed previously. We make two observations about our choice of $a_i$ and $b_i$. First, the prior expected receiver utility $\frac{1}{2}(a_i + b_i)$ equals $\frac{1}{2}$ for all actions $i$ (by simply adding Equations (7) and (8)). Second, $a_i$ and $b_i$ are both non-zero, and this follows easily from our assumption that $x_i \in (0, \frac{1}{2})$.
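As an illustration of how the payoff parameters are obtained, the following sketch solves the two-equation system for a single $x_i$ using exact rational arithmetic and Cramer's rule. The function name `payoff_parameters` is an assumption made for this example.

```python
from fractions import Fraction

def payoff_parameters(x_i):
    """Solve  x*a + (1/2 - x)*b = 0  and  (1/2 - x)*a + x*b = 1/2
    for (a, b) by Cramer's rule, in exact arithmetic."""
    x = Fraction(x_i)
    half = Fraction(1, 2)
    if not (0 < x < half) or x == Fraction(1, 4):
        raise ValueError("requires 0 < x_i < 1/2 and x_i != 1/4")
    det = x * x - (half - x) ** 2      # simplifies to x - 1/4, non-zero here
    a = -(half - x) * half / det
    b = x * half / det
    return a, b

a, b = payoff_parameters(Fraction(1, 8))
print(a, b)                            # both non-zero
print((a + b) / 2)                     # prior expected receiver utility
```

The determinant equals $x_i - \frac{1}{4}$, which is why the case $x_i = \frac{1}{4}$ has to be excluded beforehand.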

Now we show how to determine whether $\mathcal{M}(x) \in \mathcal{K}$ by only examining the optimal sender utility in the constructed persuasion instance. We start by showing that restricting to two-signal schemes is without loss of generality in our instance.


Lemma C.6. There exists an optimal incentive-compatible signaling scheme which uses at most two signals: one signal recommends the special action, and the other recommends some regular action.


Proof. Recall that an optimal incentive-compatible scheme uses $n + 1$ signals, with signal $\sigma_i$ recommending action $i$ for $i = 0, 1, \ldots, n$. Fix such a scheme, and let $\alpha_i$ denote the probability of signal $\sigma_i$. Signal $\sigma_i$ induces posterior expected receiver utility $r_j(\sigma_i)$ and sender utility $s_j(\sigma_i)$ for each action $j$. For a regular action $j \neq 0$, we have $s_j(\sigma_i) = -r_j(\sigma_i)$ for all $i$ due to the zero-sum nature of our construction. Notice that $r_i(\sigma_i) \ge 0$ for all regular actions $i \neq 0$, since otherwise the receiver would prefer action 0 over action $i$. Consequently, for each signal $\sigma_i$ with $i \neq 0$, the receiver derives non-negative utility and the sender derives non-positive utility.

We claim that merging signals $\sigma_1, \sigma_2, \ldots, \sigma_n$, i.e., modifying the signaling scheme to output the same signal $\sigma^*$ in lieu of each of them, would not decrease the sender's expected utility. Recall that incentive compatibility implies that $r_i(\sigma_i) = \max_{j=0}^{n} r_j(\sigma_i)$. Using Jensen's inequality, we get

$$\sum_{i=1}^{n} \alpha_i r_i(\sigma_i) \ge \max_{j=0}^{n} \left[\sum_{i=1}^{n} \alpha_i r_j(\sigma_i)\right] \tag{9}$$

If the maximum in the right hand side expression of (9) is attained at $j^* = 0$, the receiver will choose the special action 0 when presented with the merged signal $\sigma^*$. Recalling that $s_i(\sigma_i)$ is non-positive for $i \neq 0$, this can only improve the sender's expected utility. Otherwise, the receiver chooses a regular action $j^* \neq 0$ when presented with $\sigma^*$, resulting in a total contribution of $\sum_{i=1}^{n} \alpha_i r_{j^*}(\sigma_i)$ to the receiver's expected utility from the merged signal, down from the total contribution of $\sum_{i=1}^{n} \alpha_i r_i(\sigma_i)$ by the original signals $\sigma_1, \ldots, \sigma_n$. Recalling the zero-sum nature of our construction for regular actions, the merged signal $\sigma^*$ contributes $\sum_{i=1}^{n} \alpha_i s_{j^*}(\sigma_i) = -\sum_{i=1}^{n} \alpha_i r_{j^*}(\sigma_i)$ to the sender's expected utility, up from a total contribution of $\sum_{i=1}^{n} \alpha_i s_i(\sigma_i) = -\sum_{i=1}^{n} \alpha_i r_i(\sigma_i)$ by the original signals $\sigma_1, \ldots, \sigma_n$. Therefore, the sender is not worse off by merging the signals. Moreover, interpreting $\sigma^*$ as a recommendation for action $j^*$ yields incentive compatibility. □

Footnote (on the trivial cases where some $x_i$ equals 0 or $\frac{1}{2}$): If $x_i$ is 0 or $\frac{1}{2}$, then $\mathcal{M}(x) \in \mathcal{K}$ if and only if $x_j = \frac{1}{4}$ for all $j \neq i$. This is because the corresponding signaling scheme must choose its signal based solely on the type of action $i$.
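The merging inequality can be checked numerically on any small table of posterior payoffs. The sketch below is illustrative only: the name `merge_gap` and the input layout (one row of receiver payoffs per signal) are assumptions, and incentive compatibility is modelled by taking the row maximum as the payoff of the recommended action.

```python
from fractions import Fraction

def merge_gap(alpha, r):
    """alpha[i]: probability of signal i; r[i][j]: receiver's posterior
    expected payoff for action j given signal i.

    Returns (lhs, rhs) where
      lhs = sum_i alpha_i * max_j r[i][j]   (receiver payoff before merging)
      rhs = max_j sum_i alpha_i * r[i][j]   (receiver payoff after merging)
    """
    lhs = sum(a * max(row) for a, row in zip(alpha, r))
    n_actions = len(r[0])
    rhs = max(sum(a * row[j] for a, row in zip(alpha, r)) for j in range(n_actions))
    return lhs, rhs

alpha = [Fraction(1, 2), Fraction(1, 2)]
r = [[1, 0], [0, 1]]                   # each signal favours a different action
print(merge_gap(alpha, r))             # merging lowers the receiver's payoff
```

In the zero-sum part of the construction, whatever the receiver loses through merging is gained by the sender, which is the content of the lemma.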

Therefore, in characterizing the optimal solution to our constructed persuasion instance, it suffices to analyze two-signal schemes of the form guaranteed by Lemma C.6. For such a scheme, we denote the signal that recommends the special action 0 by $\sigma_+$ (indicating that the sender derives positive utility $\epsilon$), and denote the other signal by $\sigma_-$ (indicating that the sender derives negative utility, as we will show). For convenience, in the following discussion we use the expression “payoff from a signal” to signify the expected payoff of a player conditioned on that signal multiplied by the probability of that signal. For example, the sender's expected payoff from signal $\sigma_-$ equals the sender's expected payoff conditioned on signal $\sigma_-$ multiplied by the overall probability that the scheme outputs $\sigma_-$, assuming the receiver follows the scheme's (incentive compatible) recommendations. We also use the expression “payoff from an action in a signal” to signify the posterior expected payoff of a player for that action conditioned on the signal, multiplied by the probability that the scheme outputs the signal. For example, the receiver's expected payoff from action $i$ in signal $\sigma_+$ equals $\alpha_+ \cdot r_i(\sigma_+)$, where $r_i(\sigma_+)$ is the receiver's posterior expected payoff from action $i$ given signal $\sigma_+$, and $\alpha_+$ is the overall probability of signal $\sigma_+$.

Lemma C.7. Fix an incentive-compatible scheme with signals $\sigma_-$ and $\sigma_+$ as described above. The sender's expected payoff from signal $\sigma_-$ is at most $-\frac{1}{2}$. Moreover, if the sender's expected payoff from $\sigma_-$ is exactly $-\frac{1}{2}$, then for each regular action $i$ the expected payoff of both the sender and the receiver from action $i$ in signal $\sigma_+$ equals 0.

Proof. Assume that signal $\sigma_+$ [$\sigma_-$] is sent with probability $\alpha_+$ [$\alpha_-$] and induces posterior expected receiver payoff $r_i(\sigma_+)$ [$r_i(\sigma_-)$] for each action $i$. Recall from our construction that the prior expected receiver payoff of each regular action $i \neq 0$ equals $\frac{1}{2} a_i + \frac{1}{2} b_i = \frac{1}{2}$. Since the prior expectation must equal the expected posterior expectation, it follows that $\alpha_+ \cdot r_i(\sigma_+) + \alpha_- \cdot r_i(\sigma_-) = \frac{1}{2}$ when $i$ is regular. The receiver's reward from the special action is deterministically 0, and therefore incentive compatibility implies that $r_i(\sigma_+) \le 0$ for each regular action $i$. It follows that $\alpha_- \cdot r_i(\sigma_-) = \frac{1}{2} - \alpha_+ \cdot r_i(\sigma_+) \ge \frac{1}{2}$ for regular actions $i$. In other words, the receiver's expected payoff from each regular action in signal $\sigma_-$ is at least $\frac{1}{2}$. By the zero-sum nature of our construction, the sender's expected payoff from each regular action in signal $\sigma_-$ is at most $-\frac{1}{2}$. Since $\sigma_-$ recommends a regular action, we conclude that the sender's expected payoff from $\sigma_-$ is at most $-\frac{1}{2}$.

Now assume that the sender's expected payoff from $\sigma_-$ is exactly $-\frac{1}{2}$. By the zero-sum property, incentive compatibility, and the above-established fact that $\alpha_- \cdot r_i(\sigma_-) \ge \frac{1}{2}$ for regular actions $i$, it follows that the receiver's expected payoff from each regular action in signal $\sigma_-$ is exactly $\frac{1}{2}$. Recalling that $\alpha_+ \cdot r_i(\sigma_+) + \alpha_- \cdot r_i(\sigma_-) = \frac{1}{2}$ when $i$ is regular, we conclude that the receiver's expected payoff from a regular action in signal $\sigma_+$ equals 0. By the zero-sum property for regular actions, the same is true for the sender.

□


The key to the remainder of our reduction is to choose a small enough value for the parameter $\epsilon$, the sender's utility from the special action, so that the optimal signaling scheme satisfies the property mentioned in Lemma C.7: the sender's expected payoff from signal $\sigma_-$ is exactly equal to its maximum possible value of $-\frac{1}{2}$. In other words, we must make $\epsilon$ so small that the sender prefers not to sacrifice any of her payoff from $\sigma_-$ in order to gain utility from the special action recommended by $\sigma_+$. Notice that this upper bound of $-\frac{1}{2}$ is indeed achievable: the uninformative signaling scheme which recommends an arbitrary regular action has this property. We now show that a “small enough” $\epsilon$ indeed exists. The key idea behind this existence proof is the following. We start with a signaling scheme which maximizes the sender's payoff from $\sigma_-$ at $-\frac{1}{2}$, and moreover corresponds to a vertex of the polytope of incentive-compatible signatures. When $\epsilon > 0$ is small enough relative to the “bit complexity” of the vertices of this polytope, moving to a different vertex, one with lower sender payoff from $\sigma_-$, will result in more utility loss from $\sigma_-$ than utility gain from $\sigma_+$. We show that $\epsilon > 0$ with polynomially many bits suffices, and can be computed in polynomial time.

Let $\mathcal{P}_2$ be the family of all realizable two-signal signatures (again, ignoring action 0). It is easy to see that $\mathcal{P}_2$ is a polytope, and importantly, all entries of any vertex of $\mathcal{P}_2$ are integer multiples of $\frac{1}{2^n}$. This is because every vertex of $\mathcal{P}_2$ corresponds to a deterministic signaling scheme which partitions the set of states of nature, and every state of nature occurs with probability $1/2^n$. As a result, all vertices of $\mathcal{P}_2$ have $O(n)$ bit complexity.

To ease our discussion, we use a compact representation for points in $\mathcal{P}_2$. In particular, any point in $\mathcal{P}_2$ can be captured by $n + 1$ variables: variable $p$ denotes the probability of sending signal $\sigma_+$, and variable $y_i$ denotes the joint probability of signal $\sigma_+$ and the event that action $i$ has type $1_i$. It follows that the joint probability of type $2_i$ and signal $\sigma_+$ is $p - y_i$, and the probabilities associated with signal $\sigma_-$ are determined by the constraint that $M^+ + M^-$ is the all-$\frac{1}{2}$ matrix. With some abuse of notation, we use $\mathcal{M}(p, y) = (M^+, M^-)$ to denote the signature in $\mathcal{P}_2$ corresponding to the probability $p$ and $n$-dimensional vector $y$. Now we consider the following two linear programs.

$$\begin{array}{lll}
\text{maximize} & p\epsilon + u & \\
\text{subject to} & \mathcal{M}(p, y) \in \mathcal{P}_2 & \\
& y_i a_i + (p - y_i) b_i \le 0, & \text{for } i = 1, \ldots, n. \\
& u \le -\left[\left(\tfrac{1}{2} - y_i\right) a_i + \left(\tfrac{1}{2} - p + y_i\right) b_i\right], & \text{for } i = 1, \ldots, n.
\end{array} \tag{10}$$

$$\begin{array}{lll}
\text{maximize} & u & \\
\text{subject to} & \mathcal{M}(p, y) \in \mathcal{P}_2 & \\
& y_i a_i + (p - y_i) b_i \le 0, & \text{for } i = 1, \ldots, n. \\
& u \le -\left[\left(\tfrac{1}{2} - y_i\right) a_i + \left(\tfrac{1}{2} - p + y_i\right) b_i\right], & \text{for } i = 1, \ldots, n.
\end{array} \tag{11}$$


Linear programs (10) and (11) are identical except for the fact that the objective of LP (10) includes the additional term $p\epsilon$. LP (10) computes precisely the optimal expected sender utility in our constructed persuasion instance: the first set of inequality constraints are the incentive-compatibility constraints for the signal $\sigma_+$ recommending action 0; the second set of inequality constraints state that the sender's payoff from signal $\sigma_-$ is the minimum among all actions, as implied by the zero-sum nature of our construction; the objective is the sum of the sender's payoffs from signals $\sigma_+$ and $\sigma_-$. Notice that the incentive-compatibility constraints for signal $\sigma_-$, namely $(\frac{1}{2} - y_i) a_i + (\frac{1}{2} - p + y_i) b_i \ge 0$ for all $i \neq 0$, are implicitly satisfied because $\frac{1}{2} a_i + \frac{1}{2} b_i = \frac{1}{2}$ by our construction and $(\frac{1}{2} - y_i) a_i + (\frac{1}{2} - p + y_i) b_i = \frac{1}{2} a_i + \frac{1}{2} b_i - [y_i a_i + (p - y_i) b_i] \ge \frac{1}{2} - 0 \ge 0$. On the other hand, LP (11) maximizes the sender's expected payoff from signal $\sigma_-$. Observe that the optimal objective value of LP (11) is precisely $-\frac{1}{2}$ because $u \le -[(\frac{1}{2} - y_i) a_i + (\frac{1}{2} - p + y_i) b_i] \le -\frac{1}{2}$ for all $i \neq 0$, and equality is attained, for example, at $p = 0$ and $y = 0$.
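The explicit constraints of the two linear programs can be evaluated at a candidate point with a few lines of code. The sketch below is not a solver and does not test realizability of the signature (that is the hard part of the problem); the function name `evaluate_lp10` and the argument layout are assumptions for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def evaluate_lp10(p, y, a, b, eps):
    """Evaluate the explicit constraints of the first linear program at (p, y).

    Returns (ic_plus, u, objective) where ic_plus says whether signal sigma_+
    is incentive compatible, u is the largest value allowed by the second
    constraint set (the sender's payoff from sigma_-), and objective is
    p*eps + u. Realizability of M(p, y) is NOT checked here.
    """
    ic_plus = all(y_i * a_i + (p - y_i) * b_i <= 0
                  for y_i, a_i, b_i in zip(y, a, b))
    u = min(-((HALF - y_i) * a_i + (HALF - p + y_i) * b_i)
            for y_i, a_i, b_i in zip(y, a, b))
    return ic_plus, u, p * eps + u

# One regular action with x_1 = 1/8, which gives a_1 = 3/2 and b_1 = -1/2.
a, b, eps = [Fraction(3, 2)], [Fraction(-1, 2)], Fraction(1, 1000)
print(evaluate_lp10(Fraction(0), [Fraction(0)], a, b, eps))   # uninformative scheme
print(evaluate_lp10(HALF, [Fraction(1, 8)], a, b, eps))       # the point p = 1/2, y = x
```

At $p = \frac{1}{2}$ and $y = x$ the sender keeps payoff $-\frac{1}{2}$ from $\sigma_-$ and additionally collects $\frac{1}{2}\epsilon$ from $\sigma_+$.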

Let $\tilde{\mathcal{P}}_2$ be the set of all feasible $(u, \mathcal{M}(p, y))$ for LP (10) (and LP (11)). Obviously, $\tilde{\mathcal{P}}_2$ is a polytope. We now argue that all vertices of $\tilde{\mathcal{P}}_2$ have bit complexity polynomial in $n$ and the bit complexity of $x \in (0, \frac{1}{2})^n$. In particular, denote the bit complexity of $x$ by $\ell$. Since $a_i, b_i$ are computed by a two-variable two-equation linear system involving $x_i$ (Equations (7) and (8)), they each have $O(\ell)$ bit complexity. Consequently, all the explicitly described facets of $\tilde{\mathcal{P}}_2$ have $O(\ell)$ bit complexity. Moreover, since each vertex of $\mathcal{P}_2$ has $O(n)$ bit complexity, each facet of $\mathcal{P}_2$ then has $O(n^3)$ bit complexity, i.e., the coefficients of inequalities that determine the facets have $O(n^3)$ bit complexity. This is due to the fact that facet complexity of a rational polytope is upper bounded by a cubic polynomial of the vertex complexity and vice versa (see, e.g., [44]). To sum up, any facet of polytope $\tilde{\mathcal{P}}_2$ has bit complexity $O(n^3 + \ell)$, and therefore any vertex of $\tilde{\mathcal{P}}_2$ has $O(n^9 \ell^3)$ bit complexity.

Let the polynomial $B(n, \ell) = O(n^9 \ell^3)$ be an upper bound on the maximum bit complexity of vertices of $\tilde{\mathcal{P}}_2$. Now we are ready to set the value of $\epsilon$. LP (10) always has an optimal vertex solution which we denote as $(u^*, \mathcal{M}^*)$. Recall that $u \le -\frac{1}{2}$ for all points $(u, \mathcal{M}(p, y))$ in $\tilde{\mathcal{P}}_2$ and $u = -\frac{1}{2}$ is attainable at some vertices. Since all vertices of $\tilde{\mathcal{P}}_2$ have $B(n, \ell)$ bit complexity, $(u^*, \mathcal{M}^*)$ must satisfy either $u^* = -\frac{1}{2}$ or $u^* \le -\frac{1}{2} - \delta$ for a gap $\delta > 0$ that is at least inverse-exponential in $B(n, \ell)$. Therefore, it suffices to set $\epsilon$ to a positive number smaller than $\delta$, for instance a suitable inverse power of two, which is a number with polynomial bit complexity. As a result, any optimal vertex solution to LP (10) must satisfy $u^* = -\frac{1}{2}$, since the loss incurred by moving to any other vertex with $u < -\frac{1}{2}$ can never be compensated for by the other term $p\epsilon \le \epsilon$.

With such a small value of $\epsilon$, the sender's goal is to send signal $\sigma_+$ with probability as high as possible, subject to the constraint that her utility from $\sigma_-$ is precisely $-\frac{1}{2}$. In other words, signal $\sigma_+$ must induce expected receiver/sender utility precisely 0 for each regular action $i \neq 0$ (see Lemma C.7). This characterization of the optimal scheme now allows us to determine whether $\mathcal{M}(x) \in \mathcal{K}$ by inspecting the sender's optimal expected utility. The following lemma completes our proof of Theorem 4.1.

Lemma C.8. Given the small enough value of $\epsilon$ described above, the sender's expected utility in the optimal signaling scheme for our constructed persuasion instance is at least $\frac{1}{2}(\epsilon - 1)$ if and only if $\mathcal{M}(x) \in \mathcal{K}$.

Proof. $\Leftarrow$: If $\mathcal{M}(x) \in \mathcal{K}$, then by our choice of $a_i, b_i$ (recall Equations (7) and (8)), the signaling scheme implementing $\mathcal{M}(x)$ is incentive compatible, the sender's payoff from signal $\sigma_+$ is $\frac{1}{2}\epsilon$, and her payoff from $\sigma_-$ is $-\frac{1}{2}$. Therefore, the optimal sender utility is at least $\frac{1}{2}(\epsilon - 1)$.

$\Rightarrow$: Let $\mathcal{M}(p, y)$ be the signature of a vertex optimal signaling scheme in LP (10). By our choice of $\epsilon$ we know that the sender's payoff from signal $\sigma_-$ must be exactly $-\frac{1}{2}$. Therefore, to achieve overall sender utility at least $\frac{1}{2}(\epsilon - 1)$, signal $\sigma_+$ must be sent with probability $p \ge \frac{1}{2}$, and the receiver's payoff from each regular action $i \neq 0$ in signal $\sigma_+$ is exactly 0. That is, $y_i a_i + (p - y_i) b_i = 0$. By construction, we also have that $x_i a_i + (\frac{1}{2} - x_i) b_i = 0$ and $a_i, b_i \neq 0$, which imply that $\frac{y_i}{p} = \frac{x_i}{1/2}$ and, furthermore, that $y_i \ge x_i$ since $p \ge \frac{1}{2}$. Now let $\varphi$ be a signaling scheme with the signature $\mathcal{M}(p, y)$. We can post-process $\varphi$ so it has signature $\mathcal{M}(x)$ as follows: whenever $\varphi$ outputs the signal $\sigma_+$, flip a biased random coin to output $\sigma_+$ with probability $\frac{1}{2p}$ and output $\sigma_-$ otherwise. By using the identity $\frac{y_i}{p} = \frac{x_i}{1/2}$, it is easy to see that this adjusted signaling scheme has signature $\mathcal{M}(x)$. □
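The post-processing step at the end of the proof amounts to rescaling the $\sigma_+$ entries of the signature. A minimal sketch in the compact $(p, y)$ representation, with a function name of our choosing, is:

```python
from fractions import Fraction

def postprocess(p, y):
    """Keep signal sigma_+ with probability 1/(2p), otherwise relabel it as
    sigma_-. Takes the sigma_+ probability p >= 1/2 and the joint
    probabilities y_i = Pr[sigma_+ and action i has type 1_i]; returns the
    corresponding quantities of the adjusted scheme."""
    if p < Fraction(1, 2):
        raise ValueError("requires p >= 1/2")
    keep = Fraction(1, 2) / p
    return p * keep, [y_i * keep for y_i in y]

# Example with p = 3/4 and x = (1/8, 3/8), so that y_i = 2 * p * x_i.
x = [Fraction(1, 8), Fraction(3, 8)]
p = Fraction(3, 4)
y = [2 * p * x_i for x_i in x]
print(postprocess(p, y))               # sigma_+ now has probability 1/2 and y' = x
```

Since the coin is independent of the state of nature, every joint probability involving $\sigma_+$ is scaled by the same factor, which is why the identity $y_i / p = 2 x_i$ is all that is needed.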
D Omitted Proofs from Section 5

D.1 A Bicriteria FPTAS

Proof of Lemma HjH 

Fix $\epsilon$, $K$, and $\lambda$, and let $\varphi$ denote the resulting signaling scheme implemented by Algorithm 2. Let $\theta \sim \lambda$ denote the input to $\varphi$, and $\sigma \sim \varphi(\theta)$ denote its output. First, we condition on the empirical sample $\tilde{\lambda} = \{\theta_1, \ldots, \theta_K\}$ without conditioning on the index of the input state of nature $\theta$, and show that $\epsilon$-incentive compatibility holds subject to this conditioning. The principle of deferred decisions implies that, subject to this conditioning, $\theta$ is uniformly distributed in $\tilde{\lambda}$. By definition of linear program (4), the signaling scheme $\tilde{\varphi}$ computed in Step 3 is an $\epsilon$-incentive compatible scheme for the empirical distribution $\tilde{\lambda}$. Since $\sigma \sim \tilde{\varphi}(\theta)$ and $\theta$ is conditionally distributed according to $\tilde{\lambda}$, this implies that all $\epsilon$-incentive compatibility constraints conditionally hold; formally, the following holds for each pair of actions $i$ and $j$:

$$\mathbf{E}[r_i(\theta) \mid \sigma = \sigma_i, \tilde{\lambda}] \ge \mathbf{E}[r_j(\theta) \mid \sigma = \sigma_i, \tilde{\lambda}] - \epsilon$$

Removing the conditioning on $\tilde{\lambda}$ and invoking linearity of expectations shows that $\varphi$ is $\epsilon$-incentive compatible for $\lambda$, completing the proof.

Proof of Lemma 1531 

As in the preceding proof, we condition on the empirical sample $\tilde{\lambda} = \{\theta_1, \ldots, \theta_K\}$ and observe that $\theta$ is uniformly distributed in $\tilde{\lambda}$ after this conditioning. The conditional expectation of sender utility then equals $\sum_{k=1}^{K} \sum_{i=1}^{n} \frac{1}{K} \tilde{\varphi}(\theta_k, \sigma_i) s_i(\theta_k)$, where $\tilde{\varphi}$ is the signaling scheme computed in Step 3 based on $\tilde{\lambda}$. Since this is precisely the optimal value of the LP (4) solved in Step 3, removing the conditioning and invoking linearity of expectations completes the proof.

Proof of Lemma 113] 

Recall that linear program (1) solves for the optimal incentive compatible scheme for $\lambda$. It is easy to see that the linear program (4) solved in Step 3 is simply the instantiation of LP (1) for the empirical distribution $\tilde{\lambda}$ consisting of $K$ samples from $\lambda$. To prove the lemma, it would suffice to show that the optimal incentive-compatible scheme $\varphi^*$ corresponding to LP (1) remains $\epsilon$-incentive compatible and $\epsilon$-optimal for the distribution $\tilde{\lambda}$, with high probability. Unfortunately, this approach fails because polynomially-many samples from $\lambda$ are not sufficient to approximately preserve the incentive compatibility constraints corresponding to low-probability signals (i.e., signals which are output with probability smaller than inverse polynomial in $n$). Nevertheless, we show in Claim D.1 that there exists an approximately optimal solution $\hat{\varphi}$ to LP (1) with the property that every signal $\sigma_i$ is either large, which we define as being output by $\hat{\varphi}$ with probability at least $\frac{\epsilon}{4n}$ assuming $\theta \sim \lambda$, or honest in that only states of nature $\theta$ with $i \in \mathrm{argmax}_j\, r_j(\theta)$ are mapped to it. It is easy to see that sampling preserves incentive-compatibility exactly for honest signals. As for large signals, we employ tail bounds and the union bound to show that polynomially many samples suffice to approximately preserve incentive compatibility (Claim D.2).

Claim D.l. There is a signaling scheme p which is incentive compatible for A, induces sender utility 
Us{p, A) > OPT — ^ on X, and such that every signal ofp is either large or honest. 

Proof. Let $\varphi^*$ be the optimal incentive-compatible scheme for $\lambda$, i.e., the optimal solution to LP (1). We
call a signal $\sigma$ small if it is output by $\varphi^*$ with probability less than the threshold that defines large
signals, i.e., if $\sum_{\theta} \lambda(\theta)\, \varphi^*(\theta, \sigma)$ is below that threshold, and otherwise we call it large. Let
$\varphi$ be the scheme which is defined as follows: on input $\theta$, it first samples $\sigma \sim \varphi^*(\theta)$; if $\sigma$ is large
then $\varphi$ simply outputs $\sigma$, and otherwise it recommends an action maximizing receiver utility in state of
nature $\theta$, i.e., outputs $\sigma_{i'}$ for $i' \in \operatorname{argmax}_i r_i(\theta)$. It is easy to see that every signal of $\varphi$ is either
large or honest. Moreover, since $\varphi^*$ is incentive compatible and $\varphi$ only replaces recommendations of $\varphi^*$
with "honest" recommendations, it is easy to check that $\varphi$ is incentive compatible for $\lambda$. Finally, since
the total probability of small signals in $\varphi^*$ is at most |, and utilities are in $[-1, 1]$, the sender's expected
utility from $\varphi$ is at most | smaller than her expected utility from $\varphi^*$. □

          Rainy         Sunny
Walk      $1 - \delta$  1
Drive     1             0

Table 1: Receiver's Payoffs in Rain and Shine Example
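
As a concrete illustration of the construction in the proof of Claim D.1, the following Python sketch builds the modified scheme from a given incentive-compatible scheme by rerouting every small signal to a receiver-optimal ("honest") recommendation. The dictionary-based representation, the function names, the toy instance, and the threshold parameter `tau` are assumptions made for this example and do not come from the paper; `tau` merely stands in for whatever threshold defines large signals.

```python
from fractions import Fraction


def signal_probabilities(prior, scheme, n):
    """Probability that each signal is output when the state is drawn from the prior."""
    return [sum(prior[t] * scheme[t][i] for t in prior) for i in range(n)]


def make_large_or_honest(prior, scheme, receiver, tau):
    """Reroute every small signal to a receiver-optimal recommendation.

    tau is a hypothetical threshold: a signal is "small" if the input
    scheme outputs it with probability below tau, and "large" otherwise.
    """
    n = len(next(iter(receiver.values())))
    probs = signal_probabilities(prior, scheme, n)
    small = [i for i in range(n) if probs[i] < tau]
    new_scheme = {}
    for t in prior:
        row = list(scheme[t])
        # One receiver-optimal action in state t (ties broken by index).
        honest = max(range(n), key=lambda i: receiver[t][i])
        for i in small:
            if i != honest:
                row[honest] += row[i]
                row[i] = 0
        new_scheme[t] = row
    return new_scheme


def is_incentive_compatible(prior, scheme, receiver):
    """Check E[scheme(t, i) * (r_i(t) - r_j(t))] >= 0 for all pairs i, j."""
    n = len(next(iter(receiver.values())))
    return all(
        sum(prior[t] * scheme[t][i] * (receiver[t][i] - receiver[t][j]) for t in prior) >= 0
        for i in range(n)
        for j in range(n)
    )


# Toy instance (made up for illustration): three states, two actions.
prior = {"A": Fraction(9, 10), "B": Fraction(1, 20), "C": Fraction(1, 20)}
receiver = {"A": [1, 0], "B": [0, 1], "C": [1, Fraction(9, 10)]}
phi_star = {"A": [1, 0], "B": [0, 1], "C": [0, 1]}

phi = make_large_or_honest(prior, phi_star, receiver, tau=Fraction(1, 5))
```

In this toy instance the second signal is small under `phi_star`, so state C, where the first action is receiver-optimal, is rerouted to the first signal, while state B keeps its already honest recommendation. Every signal of `phi` is then either large or honest, and incentive compatibility is preserved.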

Claim D.2. Let $\varphi$ be the signaling scheme from Claim D.1. With probability at least 1 − | over the sample
$\tilde{\lambda}$, $\varphi$ is $\epsilon$-incentive compatible for $\tilde{\lambda}$, and moreover $u_s(\varphi, \tilde{\lambda}) \ge u_s(\varphi, \lambda) -$

Proof. Recall that $\varphi$ is incentive compatible for $\lambda$, and every signal is either large or honest. Since $\tilde{\lambda}$ is a set
of samples from $\lambda$, it is easy to see that incentive compatibility constraints pertaining to the honest signals
continue to hold over $\tilde{\lambda}$. It remains to show that incentive compatibility constraints for large signals, as well
as expected sender utility, are approximately preserved when replacing $\lambda$ with $\tilde{\lambda}$.

Recall that incentive compatibility requires that $\mathbb{E}_{\theta}[\varphi(\theta, \sigma_i)(r_i(\theta) - r_j(\theta))] \ge 0$ for each $i, j \in [n]$.
Moreover, the sender's expected utility can be written as $\mathbb{E}_{\theta}[\sum_{i=1}^{n} \varphi(\theta, \sigma_i)\, s_i(\theta)]$. The left hand side of
each incentive compatibility constraint evaluates the expectation of a fixed function of $\theta$ with range $[-2, 2]$,
whereas the sender's expected utility evaluates the expectation of a function of $\theta$ with range in $[-1, 1]$.
Standard tail bounds and the union bound, coupled with our careful choice of the number of samples $K$,
imply that replacing distribution $\lambda$ with $\tilde{\lambda}$ approximately preserves each of these $n^2 + 1$ quantities to within
an additive error of with probability at least 1 − f. This bound on the additive loss translates to
$\epsilon$-incentive compatibility for the large signals, and is less than the permitted decrease of | for expected sender
utility. □
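
The "standard tail bounds and the union bound" step in the proof of Claim D.2 can be made concrete with Hoeffding's inequality. The sketch below is a generic calculation and not the paper's specific choice of $K$: the function names, parameters, and the numeric values in the example are assumptions chosen for illustration. For simplicity it treats all $n^2 + 1$ quantities as having range width 4, which is conservative for the sender's utility (range width 2).

```python
import math


def samples_for_uniform_accuracy(num_quantities, range_width, additive_error, failure_prob):
    """Sample count from Hoeffding's inequality plus a union bound.

    Hoeffding: for K i.i.d. samples of a variable whose range has width w,
        Pr[|empirical mean - true mean| >= t] <= 2 * exp(-2 * K * t**2 / w**2).
    Union bound over m quantities: failure <= 2 * m * exp(-2 * K * t**2 / w**2).
    Solving for K gives K >= w**2 * ln(2 * m / p) / (2 * t**2).
    """
    k = range_width ** 2 * math.log(2 * num_quantities / failure_prob) / (2 * additive_error ** 2)
    return math.ceil(k)


def failure_bound(num_samples, num_quantities, range_width, additive_error):
    """Union-bound upper bound on the probability that any quantity deviates by too much."""
    exponent = -2 * num_samples * additive_error ** 2 / range_width ** 2
    return 2 * num_quantities * math.exp(exponent)


n = 3          # number of actions (illustrative value)
m = n * n + 1  # n^2 incentive constraints plus the sender's expected utility
K = samples_for_uniform_accuracy(m, range_width=4, additive_error=0.1, failure_prob=0.05)
```

The required number of samples grows only logarithmically in the number of quantities and in the inverse failure probability, and quadratically in the inverse additive error, which is why polynomially many samples suffice once the target error is inverse polynomial.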

The above claims, coupled with the fact that sender payoffs are bounded in $[-1, 1]$, imply that the
expected optimal value of linear program (4) is at least $\mathrm{OPT} - \epsilon$, as needed.

D.2 Information-Theoretic Barriers 

Impossibility of Incentive Compatibility (Proof of Theorem 5.3(a))

Consider a setting with two states of nature, which we will conveniently refer to as rainy and sunny. The 
receiver, who we may think of as a daily commuter, has two actions: walk and drive. The receiver slightly 
prefers driving on a rainy day, and strongly prefers walking on a sunny day. We summarize the receiver’s 
payoff function, parametrized by $\delta > 0$, in Table 1. The sender, who we will think of as a municipality with
black-box sample access to weather reports drawn from the same distribution as the state of nature, strongly
prefers that the receiver chooses walking regardless of whether it is sunny or rainy: we let $s_{\text{walk}} = 1$ and
$s_{\text{drive}} = 0$ in both states of nature.

Let $\lambda_r$ be the point distribution on the rainy state of nature, and let $\lambda_s$ be such that $\Pr_{\lambda_s}[\text{rainy}] =$
and $\Pr_{\lambda_s}[\text{sunny}] =$ It is easy to see that the unique direct incentive-compatible scheme for $\lambda_r$ always
recommends driving, and hence results in expected sender utility of 0. In contrast, a simple calculation
shows that always recommending walking is incentive compatible for $\lambda_s$, and results in expected sender
utility 1. If algorithm $\mathcal{A}$ is incentive compatible and $c$-optimal for a constant $c < 1$, then $\mathcal{A}(\lambda_r)$ must never
recommend walking whereas $\mathcal{A}(\lambda_s)$ must recommend walking with constant probability at least $(1 - c)$
overall (in expectation over the input state of nature $\theta \sim \lambda_s$ as well as all other internal randomness).
Consequently, given a black box distribution $\mathcal{D} \in \{\lambda_r, \lambda_s\}$, evaluating $\mathcal{A}(\mathcal{D}, \theta)$ on a random draw $\theta \sim \mathcal{D}$
yields a tester which distinguishes between $\lambda_r$ and $\lambda_s$ with constant probability $1 - c$.

             $\Pr[\theta_1]$   $\Pr[\theta_2]$   $\Pr[\theta_3]$
$\lambda$    $1 - 2\delta$     $2\delta$         0
$\lambda'$   $1 - 2\delta$     $\delta$          $\delta$

Table 2: Two Distributions on Three Actions

Since the total variation distance between $\lambda_r$ and $\lambda_s$ is $O(\delta)$, it is well known (and easy to check) that
any black-box algorithm which distinguishes between the two distributions with $\Omega(1)$ success probability
must take $\Omega(1/\delta)$ samples in expectation when presented with one of these distributions. As a consequence,
the average-case sample complexity of $\mathcal{A}$ on either of $\lambda_r$ and $\lambda_s$ is $\Omega(1/\delta)$. Since $\delta > 0$ can be made arbitrarily
small, this completes the proof.
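
The incentive-compatibility claims in the rain and shine example can be checked numerically. The Python sketch below uses the receiver payoffs of Table 1 and tests whether a scheme that always recommends one action is incentive compatible for a given prior. The function names and the concrete numbers are assumptions for illustration; in particular, the mixed prior with $\Pr[\text{sunny}] = \delta$ is just one convenient choice at total variation distance $\delta$ from the point distribution on rainy, and is not claimed to be the distribution used in the proof.

```python
from fractions import Fraction


def receiver_payoffs(delta):
    """Receiver payoffs from Table 1, indexed by state and then by action."""
    return {
        "rainy": {"walk": 1 - delta, "drive": Fraction(1)},
        "sunny": {"walk": Fraction(1), "drive": Fraction(0)},
    }


def always_recommend_is_ic(action, prior, delta):
    """An uninformative scheme that always recommends `action` is incentive
    compatible exactly when `action` maximizes the prior expected payoff."""
    r = receiver_payoffs(delta)
    expected = {a: sum(prior[s] * r[s][a] for s in prior) for a in ("walk", "drive")}
    return all(expected[action] >= expected[a] for a in expected)


def total_variation(p, q):
    """Total variation distance between two distributions on the same states."""
    return sum(abs(p[s] - q[s]) for s in p) / 2


delta = Fraction(1, 10)                           # illustrative value
point = {"rainy": Fraction(1), "sunny": Fraction(0)}
mixed = {"rainy": 1 - delta, "sunny": delta}      # assumed prior, see lead-in
```

Under the point distribution only driving can be recommended, whereas a prior that puts mass of order $\delta$ on the sunny state already makes "always walk" incentive compatible, even though the two priors are only $\delta$ apart in total variation distance.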

Impossibility of Optimality (Proof of Theorem 5.3(b))

Consider a setting with three actions $\{1, 2, 3\}$ and three corresponding states of nature $\theta_1, \theta_2, \theta_3$. In each
state $\theta_i$, the receiver derives utility 1 from action $i$ and utility 0 from the other actions. The sender, on the
other hand, derives utility 1 from action 3 and utility 0 from actions 1 and 2. For an arbitrary parameter
$\delta > 0$, we define two distributions $\lambda$ and $\lambda'$ over states of nature with total variation distance $\delta$, illustrated
in Table 2.

Assume algorithm $\mathcal{A}$ is optimal and $c$-incentive compatible for a constant $c < \frac{1}{4}$. The optimal incentive-
compatible scheme for $\lambda'$ results in expected sender utility $3\delta$ by recommending action 3 whenever the state
of nature is $\theta_2$ or $\theta_3$, and with probability $\frac{\delta}{1 - 2\delta}$ when the state of nature is $\theta_1$. Some calculation reveals
that in order to match this expected sender utility subject to $c$-incentive compatibility, signaling scheme
$\varphi' = \mathcal{A}(\lambda')$ must satisfy $\varphi'(\theta_2, \sigma_3) \ge \mu$ for $\mu = 1 - 4c > 0$. In other words, $\varphi'$ must recommend action
3 a constant fraction of the time when given state $\theta_2$ as input. In contrast, since $c < \frac{1}{4}$ it is easy to see that
$\varphi = \mathcal{A}(\lambda)$ can never recommend action 3: for any signal, the posterior expected receiver reward for action 3
is 0, whereas one of the other two actions must have posterior expected receiver reward at least $\frac{1}{2}$. It follows
that given $\mathcal{D} \in \{\lambda, \lambda'\}$, a call to $\mathcal{A}(\mathcal{D}, \theta_2)$ yields a tester which distinguishes between $\lambda$ and $\lambda'$ with constant
probability $\mu$. Since $\lambda$ and $\lambda'$ have statistical distance $\delta$, we conclude that the worst case sample complexity
of $\mathcal{A}$ on either of $\lambda$ or $\lambda'$ is $\Omega(1/\delta)$. Since $\delta > 0$ can be made arbitrarily small, this completes the proof.
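
The calculation behind the three-action example can be verified with exact arithmetic. The Python sketch below encodes the two distributions of Table 2 and a scheme that recommends action 3 in states $\theta_2$ and $\theta_3$ and with probability $\delta / (1 - 2\delta)$ in state $\theta_1$. The list-based encoding, the function names, and the value $\delta = 1/10$ are assumptions for this illustration; actions and states 1, 2, 3 are indexed 0, 1, 2 in the code.

```python
from fractions import Fraction

N = 3  # actions 1, 2, 3 and states theta_1, theta_2, theta_3 are indexed 0, 1, 2


def receiver_reward(state, action):
    """The receiver earns 1 exactly when the action matches the state."""
    return 1 if state == action else 0


def sender_utility(prior, scheme):
    """The sender earns 1 only when action 3 (index 2) is recommended."""
    return sum(prior[s] * scheme[s][2] for s in range(N))


def min_ic_slack(prior, scheme):
    """Smallest value of E[scheme(s, i) * (r_i(s) - r_j(s))] over all pairs i, j.
    The scheme is incentive compatible exactly when this is nonnegative."""
    return min(
        sum(prior[s] * scheme[s][i] * (receiver_reward(s, i) - receiver_reward(s, j))
            for s in range(N))
        for i in range(N)
        for j in range(N)
    )


delta = Fraction(1, 10)                        # illustrative value
lam = [1 - 2 * delta, 2 * delta, Fraction(0)]  # first row of Table 2
lam_prime = [1 - 2 * delta, delta, delta]      # second row of Table 2

q = delta / (1 - 2 * delta)                    # probability of recommending 3 in theta_1
scheme = [[1 - q, 0, q], [0, 0, 1], [0, 0, 1]]
```

For the second distribution the scheme is incentive compatible with zero slack and earns the sender exactly $3\delta$, while for the first distribution the same scheme violates incentive compatibility, because action 3 has posterior expected reward 0 for the receiver.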